FIRST PASSAGE PERCOLATION ON THE NEWMAN-WATTS
SMALL WORLD MODEL 

JÚLIA KOMJÁTHY AND VIKTÓRIA VADON


Abstract. The Newman-Watts model is given by taking a cycle graph of n vertices and
then adding each possible edge $(i, j)$, $|i - j| \neq 1 \bmod n$, with probability $p/n$ for some
constant $p > 0$. In this paper we add i.i.d. exponential edge weights to this graph, and
investigate typical distances in the corresponding random metric space given by the least
weight paths between vertices. We show that typical distances grow as $\frac{1}{\lambda} \log n$ for a $\lambda > 0$
and determine the distribution of smaller order terms in terms of limits of branching 
process random variables. We prove that the number of edges along the shortest weight 
path follows a Central Limit Theorem, and show that in a corresponding epidemic spread 
model the fraction of infected vertices follows a deterministic curve with a random shift. 


1. The model and main results 

1.1. The Newman-Watts model. The Newman-Watts small world model, often referred 
to as “small world” in short, is one of the first random graph models created to model real-life 
networks. It was introduced by Watts and Strogatz [34], and a simplifying modification was 
made by Newman and Watts [29] later. The Newman-Watts model consists of a cycle on n
vertices, each connected to the $k \ge 1$ nearest vertices, and then extra shortcut edges are
added in a similar fashion to the creation of the Erdős-Rényi graph [20]: i.e., for each pair of
not yet connected vertices, we connect them independently with probability p.

The model has been studied from different aspects. Newman et al. studied distances 
[30, 31] with simulations and mean-field approximation, as well as the threshold for a large 
outbreak of the spread of non-deterministic epidemics [28]. Barbour and Reinert treated 
typical distances rigorously. First, in [6], they studied a continuous circle with circumference 
n instead of a cycle on n many vertices, and added Poi(np/2) many 0-length shortcuts at
locations chosen according to the uniform measure on the circle. Then, in [7], they studied 
the discrete model, with all edge lengths equal to 1. They showed that typical distances in 
both models scale as log n. 

Besides typical distances, the mixing time of simple random walk on the Newman-Watts 
model was also studied, i.e., the time when the distribution of the position of the walker gets 
close enough to the stationary distribution in total variation distance. Durrett [19] showed 
that the order of the mixing time is between $(\log n)^2$ and $(\log n)^3$, then Addario-Berry and
Lei [1] proved that Durrett's lower bound is sharp.

1.2. Main results. We work on the Newman-Watts small world model [29] with independent
random edge weights: we take a cycle $C_n$ on n vertices, that we denote by $[n] := \{1, 2, \dots, n\}$,
and each edge $(i, j)$, $i, j \in [n]$, $|i - j| = 1 \bmod n$ is present. Then independently for each
$i, j \in [n]$, $|i - j| \neq 1 \bmod n$ we add the edge $(i, j)$ with probability $p/n$ to form shortcut
edges. The parameter p is the asymptotic average number of shortcuts from a vertex.
Conditioned on the edges of the resulting graph, we assign weights that are i.i.d. exponential
random variables with mean 1 to the edges. We denote the weight of edge e by $X_e$. We
write $\mathrm{NW}_n(p)$ for a realization of this weighted random graph.
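
For illustration, the construction above can be sampled directly. The following Python sketch is not part of the original analysis: the function name `sample_newman_watts`, the 0-based vertex labels and the dictionary representation of the graph are assumptions made for the example.

```python
import random

def sample_newman_watts(n, p, seed=None):
    """Return {frozenset({i, j}): weight} for one realization of the weighted graph.

    Vertices are 0..n-1 (the text uses 1..n). Cycle edges are always present,
    every other pair becomes a shortcut with probability p/n, and each edge
    that is present receives an independent Exp(1) weight.
    """
    rng = random.Random(seed)
    weights = {}
    for i in range(n):
        for j in range(i + 1, n):
            # For i < j the pair is a cycle edge iff j - i is 1 or n - 1.
            is_cycle_edge = (j - i) in (1, n - 1)
            if is_cycle_edge or rng.random() < p / n:
                weights[frozenset((i, j))] = rng.expovariate(1.0)
    return weights
```

The double loop costs order n^2 operations, which is adequate for small experiments; a faster sampler would draw the number of shortcuts first, at the price of a less transparent correspondence with the definition.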

We define the distance between two vertices in $\mathrm{NW}_n(p)$ as the sum of weights along the
shortest weight path connecting the two vertices. In this respect, the weighted graph with 
this distance function is a (non-Euclidean) random metric space. Further, interpreting the 
edge weights as time or cost, the distance between two vertices can also correspond to the 
time it takes for information to spread from one vertex to the other on the network, or it 
can model the cost of transmission between the two vertices. 

We say that a sequence of events $\{\mathcal{E}_n\}_{n \in \mathbb{N}}$ happens with high probability (w.h.p.) if
$\lim_{n \to \infty} \mathbb{P}(\mathcal{E}_n) = 1$, that is, the probability that the event holds tends to 1 as the size of
the graph tends to infinity. We write Bin, Poi, Exp for binomial, Poisson, and exponential
distributions. For random variables $\{X_n\}_{n \in \mathbb{N}}, X$, we write $X_n \xrightarrow{d} X$ if $X_n$ tends to $X$ in
distribution as $n \to \infty$. The moment generating function of a random variable X is the
function $M_X(\vartheta) := \mathbb{E}[\exp\{-\vartheta X\}]$.

Our first result is about typical distances in the weighted graph. Let $\Gamma_{ij}$ denote the set
of all paths $\gamma$ in $\mathrm{NW}_n(p)$ between two vertices $i, j \in [n]$. Then the weight of the shortest
weight path is defined by

\[
  \mathcal{P}_n(i, j) := \min_{\gamma \in \Gamma_{ij}} \sum_{e \in \gamma} X_e. \tag{1.1}
\]

Theorem 1.1 (Typical distances). Let U, V be two uniformly chosen vertices in [n]. Then,
the distance $\mathcal{P}_n(U, V)$ in $\mathrm{NW}_n(p)$ with i.i.d. Exp(1) edge weights satisfies w.h.p.

\[
  \mathcal{P}_n(U, V) - \frac{1}{\lambda} \log n \xrightarrow{d} -\frac{1}{\lambda} \big( \log(W_U W_V) + \Lambda + c \big),
\]

where $\lambda$ is the largest root of the polynomial $x^2 + (1 - p)x - 2p$, $\Lambda$ is a standard
Gumbel random variable, the random variables $W_U, W_V$ are independent copies of the
martingale limit of the multi-type branching process defined below in Section 2.3, and
$c := \log(1 - \pi_R^2) - \log(\lambda(\lambda + 1))$ with $\pi_R = 2/(\lambda + 2)$.

Let us write $\gamma^* = \gamma^*(i, j)$ for the path that minimizes the weighted distance in (1.1). We
call $\mathcal{H}_n(U, V) := |\gamma^*(U, V)|$ the hopcount, i.e., the number of edges along the shortest-weight
path between two uniformly chosen vertices.

Theorem 1.2 (Central Limit Theorem for the hopcount). Let U, V be two uniformly chosen
vertices in [n]. Then, the hopcount $\mathcal{H}_n(U, V)$ in $\mathrm{NW}_n(p)$ with i.i.d. Exp(1) edge weights
satisfies w.h.p.

\[
  \frac{\mathcal{H}_n(U, V) - \frac{\lambda + 1}{\lambda} \log n}{\sqrt{\frac{\lambda + 1}{\lambda} \log n}} \xrightarrow{d} Z,
\]

where Z is a standard normal random variable.

Our next result characterises the proportion of vertices within distance t away from a
uniformly chosen vertex U as a function of t. To put this result into perspective, note that
we can model the spread of information starting from some source set $I_0 \subset [n]$ at time
t = 0 as follows: We assume that once a vertex v receives the information at time t, it
starts transmitting the information towards all its neighbors at rate 1. Let us denote the
set of vertices that are connected to v by an edge by $H(v)$; then, for each $w \in H(v)$, w receives
the information from v at time $t + X_{(v,w)}$. We further assume that transmission happens
only after the first receipt of the information, that is, any subsequent receipts are ignored.
If instead of the spread of information, we model the spread of a disease, this model
is often called an SI-epidemic (susceptible-infected).

In the next theorem we consider this epidemic spread model from a single source $I_0 = \{U\}$
on $\mathrm{NW}_n(p)$ with i.i.d. Exp(1) transmission times. We define

\[
  I_n(t, U) := \frac{1}{n} \sum_{i \in [n]} \mathbf{1}\{i \text{ is infected before or at time } t\} = \frac{1}{n} \#\{i : i \in [n], \mathcal{P}_n(U, i) \le t\}, \tag{1.2}
\]

the fraction of infected vertices at time t of the epidemic started from the vertex U.
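
As a small illustration of (1.2), the infected fraction can be read off from the weighted distances from the source. The sketch below assumes that these distances have already been computed and are passed in as a dictionary; the name `infected_fraction` and this input format are choices made for the example only.

```python
def infected_fraction(distances, n, t):
    """Fraction of the n vertices whose weighted distance from U is at most t.

    `distances` maps each vertex to its distance from the source U
    (the source itself has distance 0).
    """
    return sum(1 for d in distances.values() if d <= t) / n
```

Evaluating this function on a grid of times t gives the empirical epidemic curve of one realization.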

Theorem 1.3 (Epidemic curve). Let U be a uniformly chosen vertex in [n], and let us
consider the epidemic spread with source U and i.i.d. Exp(1) transmission times on $\mathrm{NW}_n(p)$.
Then, the proportion of infected individuals satisfies w.h.p.

\[
  I_n\big(t + \tfrac{1}{\lambda} \log n, U\big) \xrightarrow{d} f\big(t + \tfrac{1}{\lambda} \log W_U\big),
\]

where $f(t) = 1 - M_{W_V}(x(t))$, where $M_{W_V}(\cdot)$ is the moment generating function of $W_V$, and
$x(t) = (1 - \pi_R^2) e^{\lambda t} / (\lambda(\lambda + 1))$ with $\pi_R = 2/(\lambda + 2)$; and where $W_U, W_V$ are the same
random variables as in Theorem 1.1.
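
As a heuristic consistency check (not the proof given in Section 4), the shape of f can be related to Theorem 1.1. The sketch assumes the convention $\mathbb{P}(\Lambda \le y) = \exp(-e^{-y})$ for the standard Gumbel variable, that $\Lambda$ is independent of $(W_U, W_V)$, and that the limit in Theorem 1.1 has the form $-\frac{1}{\lambda}(\log(W_U W_V) + \Lambda + c)$; these are assumptions of the illustration. Under them,

\[
\begin{aligned}
  \mathbb{P}\Big( -\tfrac{1}{\lambda} \big( \log(W_U W_V) + \Lambda + c \big) \le t \,\Big|\, W_U, W_V \Big)
  &= \mathbb{P}\big( \Lambda \ge -\lambda t - \log(W_U W_V) - c \,\big|\, W_U, W_V \big) \\
  &= 1 - \exp\big( -e^{c} e^{\lambda t} W_U W_V \big).
\end{aligned}
\]

Averaging over $W_V$ only, and recalling $M_X(\vartheta) = \mathbb{E}[\exp\{-\vartheta X\}]$, the right hand side becomes $1 - M_{W_V}(e^{c} e^{\lambda t} W_U)$, which is $f(t + \frac{1}{\lambda} \log W_U)$ when $x(t) = e^{c} e^{\lambda t}$.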


Remark 1.4. The intuitive message of Theorem 1.3 is that a linear proportion of infected
vertices can be observed after a time that is proportional to the logarithm of the size of the
population. This time has a random shift given by $\frac{1}{\lambda} \log W_U$. Besides this random shift, the
fraction of infected individuals follows a deterministic curve $f(\cdot)$: only the ‘position of the
curve’ on the time axis is random. A bigger value of $W_U$ means that the local neighborhood
of U is “dense”, and hence the spread is quick in the initial stages: indeed, a bigger value of
$W_U$ shifts the function $f(t + (\log W_U)/\lambda)$ more to the left on the time axis. This phenomenon
has been observed in real-life epidemics, see e.g. [2, 33] for a characterisation of typical
epidemic curve shapes. For individual epidemic curves, browse e.g. [17].

The next proposition characterises the function $M_{W_V}(\vartheta)$ in the definition of the epidemic
curve function $f(t)$ in Theorem 1.3.


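Proposition 1.5 (Functional equation system for the moment generating function). The
moment generating function $M_{W_V}(\vartheta)$, $\vartheta \in \mathbb{R}^+$, of the random variable $W_V$ satisfies the
following functional equation system, with $M_{W_V}(\vartheta) := M_{W^{(B)}}(\vartheta)$:

\[
\begin{aligned}
  M_{W^{(B)}}(\vartheta) &= \left( \int_0^\infty M_{W^{(R)}}(\vartheta e^{-\lambda x}) e^{-x} \, dx \right)^2 \cdot \exp\left\{ p \cdot \int_0^\infty \big( M_{W^{(B)}}(\vartheta e^{-\lambda x}) - 1 \big) e^{-x} \, dx \right\}, \\
  M_{W^{(R)}}(\vartheta) &= \int_0^\infty M_{W^{(R)}}(\vartheta e^{-\lambda x}) e^{-x} \, dx \cdot \exp\left\{ p \cdot \int_0^\infty \big( M_{W^{(B)}}(\vartheta e^{-\lambda x}) - 1 \big) e^{-x} \, dx \right\}.
\end{aligned} \tag{1.3}
\]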


Remark 1.6. These functional equations and the fact that there exists a solution for all
$\vartheta \in \mathbb{R}^+$ follow from the usual branching recursion of multi-type branching processes, that
can be found e.g. in [5]. 

1.3. Related literature, comparison and context. First passage percolation (FPP) was
first introduced by Hammersley and Welsh [21] to study spreading dynamics on lattices, in
particular on $\mathbb{Z}^d$, $d \ge 2$. The intuitive idea behind the method is that one imagines water
flowing at a constant rate through the (random) medium, the waterfront representing the 
spread. The model turned out to be able to capture the core idea of several other processes, 
such as weighted graph distances and epidemic spreads. 

Janson [24] studied typical distances and the corresponding hopcount, flooding times 
as well as diameter of FPP on the complete graph. He showed that typical distances, the
flooding time and diameter converge to 1, 2, and 3 times $\log n / n$, respectively, while the
hopcount is of order $\log n$.

Universality class. In a sequence of papers (e.g. [11, 12, 10, 23]) van der Hofstad et al. 
investigated FPP on random graphs. Their aim was to determine universality classes for 
the shortest path metric for weighted random graphs without ‘extrinsic’ geometry (e.g. the 
supercritical Erdős-Rényi random graph, the configuration model, or rank-1 inhomogeneous
random graphs). They showed that typical distances and the hopcount scale as $\log n$, as long
as the degree distribution has finite asymptotic variance and the edge weights are continuous
on $[0, \infty)$. On the other hand, power-law degrees with infinite asymptotic variance drastically
change the metric and there are several universality classes, compare [23] with [10]. In this
respect, Theorems 1.1 and 1.2 show that the presence of the circle does not modify the
universality class of the model. 

Comparison to the Erdős-Rényi graph. Notice that the subgraph formed by shortcut
edges is approximately an Erdős-Rényi graph, with the difference that the presence of the
cycle always makes $\mathrm{NW}_n(p)$ connected and hence there is no subcritical or critical regime
in $\mathrm{NW}_n(p)$. Typical distances on the Erdős-Rényi graph with parameter $p/n$ and Exp(1)
edge weights scale as $\log n/(p - 1)$ [11], while for $\mathrm{NW}_n(p)$ they scale as $(\log n)/\lambda$, with
$\lambda = (p - 1 + \sqrt{p^2 + 6p + 1})/2 > p - 1$ for all $p > 0$. This means that when p > 1, the
presence of the cycle makes typical distances shorter, and this appears already in the constant
scaling factor of $\log n$. However, $\lambda(p)/p \to 1$ as $p \to \infty$, meaning that the effect of the cycle
becomes more and more negligible as the number of shortcut edges grows.
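
Both claims about $\lambda$ follow from the closed form $\lambda = (p - 1 + \sqrt{p^2 + 6p + 1})/2$ by elementary algebra; the short verification below is added for the reader's convenience:

\[
\begin{aligned}
  \lambda > p - 1 \quad &\Longleftrightarrow \quad \sqrt{p^2 + 6p + 1} > p - 1, \\
  p^2 + 6p + 1 - (p - 1)^2 &= 8p > 0 \qquad (p > 0), \\
  \frac{\lambda(p)}{p} &= \frac{1}{2} \left( 1 - \frac{1}{p} + \sqrt{1 + \frac{6}{p} + \frac{1}{p^2}} \right) \longrightarrow 1 \qquad (p \to \infty).
\end{aligned}
\]

The first equivalence is trivial when $p - 1 \le 0$, and otherwise follows by squaring both sides and using the second line.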

Comparison to inhomogeneous random graphs. Kolossváry et al. [27] studied FPP on
the inhomogeneous random graph model (IHRG), defined in [14]. In this model, vertices
have types from a type space S, and conditioned on the types of the vertices, edges are
present independently with probabilities that depend on the types. One can fine-tune the
parameters of this model so that any finite neighborhood of a vertex in the $\mathrm{NW}_n(p)$ model
is similar to that in the IHRG, that is, both of them can be modelled using the same
continuous time multi-type branching process. It would be natural to conjecture that typical
distances are then the same in these two models. It turns out that this is almost but not
entirely the case: the first order term $\lambda^{-1} \log n$ and the random variables $W_U, W_V$ are the
same, but the additive constant c in Theorem 1.1 is not: the geometry of the Newman-Watts
model modifies how the two branching processes can connect to each other, which modifies
the constant. Writing the main result in [27] in the same form as the one in Theorem 1.1,
we obtain $c_{\mathrm{IHRG}} = \log\big((p + 2)(2p + \lambda^2)/(p(\lambda + 2)^2 \lambda(\lambda + 1))\big)$.

The epidemic curve. In [13] Bhamidi et al. pointed out the connection between FPP, 
typical distances, and the epidemic curve by studying the epidemic spread on the configuration 
model with arbitrary continuous edge-weight distribution. Earlier, Barbour and Reinert [8]
investigated the epidemic curve on the Erdős-Rényi random graph and on the configuration
model with bounded degrees, where other aspects, such as a contagious period of
vertices or a dependence of the transmission time distribution on the degrees, might also be present.

Possible future directions. In [3, 9, 18] the competition of two spreading processes running
on the same graph is investigated. This can be considered a competition between two
epidemics, as well as the word-of-mouth marketing of two similar products. The results
suggest that the outcome depends on the universality class of the model: in ultra-small
worlds, one competitor only gets a negligible part of the vertices, while on regular graphs
coexistence might be possible, i.e., both colors can paint a linear fraction of vertices. Studying
competition on $\mathrm{NW}_n(p)$ is an interesting and challenging future project.

1.4. Structure of the paper. In what follows, we prove Theorems 1.1, 1.2 and 1.3. The 
brief idea of the proof is the following: we choose two vertices uniformly at random, then we 
start to explore the neighbourhoods of these vertices in the graph in terms of the distance 
from these vertices (Section 2). We show that this procedure w.h.p. results in ‘shortest 
weight trees’ (SWT’s) that can be coupled to two independent copies of a continuous time 
multi-type branching process (CMBP). We then handle how these two shortest weight trees 
connect in the graph in Section 3 with the help of a Poisson approximation. We provide the 
proof of Theorem 1.3 about the epidemic curve in Section 4 based on our result on distances. 
Finally we prove the Central Limit Theorem for the hopcount in Section 5, based on an 
indicator representation of the ‘generation of vertices’ in the branching processes. 

2. Exploration process 

To explore the neighborhood of a vertex, we use a modification of Dijkstra’s algorithm. 

Introduce the following notations: $\mathcal{N}(t), \mathcal{A}(t), \mathcal{U}(t)$ denote the set of explored (dead),
active (alive) and unexplored vertices at time t, respectively, and $N(t), A(t), U(t)$ denote the
sizes of these sets. The remaining lifetime of some vertex $w \in \mathcal{A}(t)$ at time t is denoted
by $R_w(t)$, and means that w will become explored exactly at time $t + R_w(t)$. The set of
remaining lifetimes is $\mathcal{R}_{\mathcal{A}(t)}(t)$. As before, $H(v)$ denotes the neighbors of a vertex v.

2.1. The exploration process on an arbitrary weighted graph. Let i = 1. The vertex
from which we start the exploration process is denoted by $v_1$. We color $v_1$ blue and set the
time as $t = T_1 = 0$. Evidently, we take

\[
  \mathcal{N}(0) = \{v_1\}, \quad \mathcal{A}(0) = H(v_1), \quad \mathcal{U}(0) = [n] \setminus (\{v_1\} \cup H(v_1)).
\]

The remaining lifetimes are determined by the edge weights, i.e.

\[
  \mathcal{R}_{\mathcal{A}(0)}(0) = \{R_w(0) = X_{(v_1, w)} \text{ for all } w \in H(v_1)\}.
\]

We color the active vertices $w \in H(v_1)$ to have the same color as the edge $(v_1, w)$.

We work with induction from now on. In each step, we increase i by 1. We can construct 
the continuous time process in steps, namely, at the random times when we explore a new 
vertex. 

Let $\tau_i = \min(\mathcal{R}_{\mathcal{A}(T_{i-1})}(T_{i-1}))$, the minimum of remaining lifetimes. Then define
$T_i := T_{i-1} + \tau_i$, the time when we explore the next vertex. Nothing changes in the time
interval $[T_{i-1}, T_i)$, hence for any $t \in [T_{i-1}, T_i)$,

\[
  \mathcal{N}(t) := \mathcal{N}(T_{i-1}), \quad \mathcal{A}(t) := \mathcal{A}(T_{i-1}), \quad \mathcal{U}(t) := \mathcal{U}(T_{i-1}).
\]

From all the remaining lifetimes, we subtract the time passed: for $0 \le s < \tau_i$,

\[
  \mathcal{R}_{\mathcal{A}(T_{i-1})}(T_{i-1} + s) := \mathcal{R}_{\mathcal{A}(T_{i-1})}(T_{i-1}) - s,
\]

subtracted element-wise. At time $T_i$, the vertex $v_i$ (or all the vertices, if there is more than
one such vertex) whose remaining lifetime equals 0 becomes explored and its
neighbors become active. We shall refer to $v_i$ as the explored vertex. We set

\[
  \mathcal{N}(T_i) := \mathcal{N}(T_{i-1}) \cup \{v_i\}, \quad \mathcal{A}(T_i) := (\mathcal{A}(T_{i-1}) \setminus \{v_i\}) \cup H(v_i), \quad \mathcal{U}(T_i) := \mathcal{U}(T_{i-1}) \setminus H(v_i).
\]

We refresh the set of remaining lifetimes:

\[
  \mathcal{R}_{\mathcal{A}(T_i)}(T_i) := \mathcal{R}_{\mathcal{A}(T_{i-1})}(T_i) \setminus \{R_{v_i}(T_i)\} \cup \{R_x(T_i) : x \in H(v_i)\},
\]

where $R_x(T_i) = X_{(v_i, x)}$, the edge weight of $(v_i, x)$, and x also gets the color of $(v_i, x)$.

On an arbitrary connected weighted graph, the exploration process can be continued until 
all vertices become explored. Note that this algorithm builds the shortest weight tree SWT 
from the starting vertex. This tree will be modeled using the branching process. 
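
The exploration above can be sketched in a few lines of Python. This is an illustrative implementation under stated assumptions, not the authors' code: the graph is given as a nested dictionary `adj`, the names `explore`, `dist`, `hops` and `parent` are hypothetical, and instead of decreasing all remaining lifetimes at every step the sketch stores the absolute exploration time of each active entry in a heap, which yields the same order of exploration.

```python
import heapq
import itertools

def explore(adj, source):
    """Explore a weighted graph from `source` in order of weighted distance.

    `adj[v]` maps each neighbor w of v to the weight of the edge (v, w).
    Returns (dist, hops, parent): the weighted distance from the source, the
    number of edges on the shortest-weight path, and the parent of each
    vertex in the shortest weight tree.
    """
    dist, hops, parent = {}, {}, {}
    tie = itertools.count()  # tie-breaker so the heap never compares vertices
    # Active entries: (exploration time, tie-breaker, vertex, parent, hops).
    # The same vertex may occur several times among the active entries.
    active = [(0.0, next(tie), source, None, 0)]
    while active:
        t, _, v, par, h = heapq.heappop(active)
        if v in dist:
            continue  # a later occurrence of an already explored vertex
        dist[v], hops[v], parent[v] = t, h, par
        for w, weight in adj[v].items():
            if w not in dist:
                heapq.heappush(active, (t + weight, next(tie), w, v, h + 1))
    return dist, hops, parent
```

The dictionary `hops` records the hopcount of each explored vertex, and `parent` encodes the edges of the shortest weight tree.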

Remark 2.1. The set of active vertices might contain several occurrences of a vertex, in 
case at least two neighbors of a vertex are explored already, see Figure 3. 



Figure 1. A realisation of the Newman-Watts model for k = 1 and p = 
1.1 with 60 vertices. On these two pictures, we illustrated the growing 
neighbourhood of a uniformly picked vertex. Circle edges are red and global 
edges are blue in the exploration. The edges that are partially red or blue 
are the ones that have an already explored vertex on one side while a not-yet 
explored (active) vertex on the other side. 


2.2. Exploration on the weighted Newman-Watts random graph. Note that when 
applying the exploration process above on a realization of $\mathrm{NW}_n(p)$, we can reveal the presence
of the edges and their weights in $\mathrm{NW}_n(p)$ along the exploration process. In this respect, all
the above quantities become random variables. Here we investigate the behaviour of this 
random exploration process. 

Let us color the cycle-edges red and the shortcut-edges blue, and let us say that a vertex is 
red/blue if it is reached first via a red/blue edge during the exploration. We allow double (or 
more) occurrences of the same vertex in [n] among the active vertices (with two remaining
lifetimes along the two edges) in the exploration and also contradicting colorings. When such
a vertex gets explored for the first time, it gets the color that corresponds to the remaining
lifetime that became 0 (and we forget about the other colors). Below, adding the subscript R
or B to any quantity corresponds to the same quantity restricted to only the red or blue
vertices, respectively.

When running the exploration process, we build a weighted tree along the process 
containing the edges that are used to discover the new vertices in the exploration (this is a 
tree since we do not explore vertices twice). This tree has root $v_1$, grows in time, and at any
time t it contains a vertex $v \in [n]$ precisely when $\mathcal{P}_n(v_1, v) \le t$. Let us denote the tree up to
time t by $\mathrm{SWT}^{v_1}(t)$.

Claim 2.2 (Children). Suppose the vertex v is being explored for the first time (i.e., not
“double-explored”). If v is red, one new red and Binomial$(n - 3, p/n)$ many new blue active
vertices are born. If v is blue, two new red and Binomial$(n - 4, p/n)$ many new blue active
vertices are born. The number of new blue active vertices is asymptotically Poi(p) in both
cases. Further, at any time t, the elements of $\mathcal{R}_{\mathcal{A}(t)}(t)$ are i.i.d. Exp(1) random variables,
and the next explored vertex is chosen uniformly over the set of active vertices.

Proof. On a cycle there are two vertices neighboring a vertex, hence, if v is red, then it has
been reached from one of its neighbors. The other one is added to the new red active vertices.
If v is blue then it has been reached via a shortcut edge and hence both of its neighbors on
the cycle are added to the new red active vertices. Since there are Bin$(n - 3, p/n)$ many
shortcut edges from a vertex, this is also the distribution of new blue active vertices born
when exploring a red vertex. For the exploration of a blue vertex, we reached this vertex
via a blue edge, hence an additional Bin$(n - 4, p/n)$ new active blue vertices are born. Clearly, by
the convergence of binomial to Poisson distribution, each vertex has asymptotically Poi(p)
many blue neighbours. The second statement follows from the fact that the edge weights are
i.i.d. exponential random variables, which have the memoryless property. Finally, note that at
any time, $\mathcal{R}_{\mathcal{A}(t)}(t)$ consists of i.i.d. exponential random variables, and the algorithm takes
the minimum of these. Clearly, the minimum of finitely many absolutely continuous random
variables is unique almost surely, and uniform over the indices. □

2.3. Multi-type branching processes. We define the following continuous time multi-type
branching process (CMBP) that will correspond to the initial stages of SWT(t).

There are two particle types, red (R) and blue (B), and their lifetime is Exp(1), independent
from everything else. Particles give birth upon their death. They leave behind offspring as 
in Claim 2.2: each particle has Poi(p) many blue offspring, red particles have one, while blue 
particles have two red children. Dead and alive particles will correspond to explored and 
active vertices, respectively. With this wording, for the number of alive and dead particles, 
we define 

Definition 2.3. We shall write $\mathbf{A}(t) = (A_R(t), A_B(t))$ for the number of alive particles of
each type, $A(t)$ standing for the total number of alive particles. Let $N(t) = N_R(t) + N_B(t)$,
where $N_q(t)$ means the number of dead particles of type $q = R, B$. We assume the above
quantities to be right-continuous. Superscripts (R), (B) refer to the process started with a 
single particle of the given type. 

The exploration process corresponds to the process started with a single blue-type particle, 
which dies immediately. 
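
The process is easy to simulate split by split. The sketch below is an illustration rather than part of the original argument: it uses the memoryless property of the Exp(1) lifetimes, so that with S alive particles the time to the next death is Exp(S) and the dying particle is uniform among the alive ones. The function names `poisson` and `simulate_cmbp` are hypothetical, and the Poisson sampler is Knuth's multiplication method, chosen only to keep the example within the standard library.

```python
import math
import random

def poisson(rng, mean):
    """Sample Poisson(mean) by Knuth's multiplication method (small mean only)."""
    threshold, k, prod = math.exp(-mean), 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k

def simulate_cmbp(p, splits, seed=None):
    """Simulate the two-type process for `splits` deaths after the root's.

    Returns (time, alive_red, alive_blue). The root is a blue particle that
    dies at time 0, leaving two red and Poisson(p) blue children.
    """
    rng = random.Random(seed)
    red, blue, t = 2, poisson(rng, p), 0.0
    for _ in range(splits):
        alive = red + blue
        t += rng.expovariate(alive)       # minimum of `alive` Exp(1) clocks
        if rng.random() < red / alive:    # the dying particle is uniform
            pass                          # red dies, one red child: no net change
        else:
            blue -= 1                     # a blue particle dies ...
            red += 2                      # ... and leaves two red children
        blue += poisson(rng, p)           # every death adds Poisson(p) blue children
    return t, red, blue
```

For a long run one would expect the fraction of red particles among the alive ones to be close to $2/(\lambda + 2)$, and the number of alive particles divided by the number of splits to be close to $\lambda$, where $\lambda$ is the largest root of $x^2 + (1 - p)x - 2p$.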

2.3.1. Literature on multi-type branching processes. Here we restate the necessary theorems 
from [5] which we will use. 

Definition 2.4 (Mean matrix). Let $M(t)$, with entries $M_{r,q}(t) := \mathbb{E}[A_q^{(r)}(t)]$ $(q, r = R, B)$, be the mean
matrix, where $A_q^{(r)}(t)$ is as defined above in Definition 2.3.

It is not hard to see that M(t) satisfies the semigroup property M(t + s) = M(t)M(s) and
the continuity condition $\lim_{t \to 0} M(t) = I$, where I denotes the identity matrix. As a result,
we have:


Theorem 2.5 (Athreya-Ney). There exists an infinitesimal generator matrix Q such that
$M(t) = e^{Qt}$, where $Q_{r,q} = a_r (\mathbb{E}[D_q^{(r)}] - \delta_{r,q})$. Here, $a_r$ is the rate of dying for a particle of
type r (i.e., the parameter of its exponential lifetime), D is the number of offspring with the
same sub- and superscript conventions as in Definition 2.3, and $\delta_{r,q} = \mathbf{1}\{r = q\}$ (i.e., $\delta_{r,q} = 1$
if and only if r = q).


In our case,

\[
  Q = \begin{pmatrix} 0 & p \\ 2 & p - 1 \end{pmatrix}.
\]

Eigenvalues and eigenvectors of the Q matrix. Using the characteristic polynomial, for p > 0,
the maximal eigenvalue $\lambda$ and the second eigenvalue $\lambda_2$ are given by

\[
  \lambda = \frac{p - 1 + \sqrt{p^2 + 6p + 1}}{2}, \qquad \lambda_2 = \frac{p - 1 - \sqrt{p^2 + 6p + 1}}{2}. \tag{2.1}
\]

The normalised left eigenvector $\pi$ that satisfies $\pi Q = \lambda \pi$ gives the stationary type-distribution:

\[
  \pi = (\pi_R, \pi_B) = \left( \frac{2}{\lambda + 2}, \frac{\lambda}{\lambda + 2} \right). \tag{2.2}
\]
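
The eigenvalue and the left eigenvector can be checked numerically. The following sketch is only a sanity check under the assumption that rows and columns of Q are ordered (R, B), so that Q has rows (0, p) and (2, p - 1); the function names are hypothetical.

```python
import math

def malthusian(p):
    """Largest root of x^2 + (1 - p) x - 2p, the maximal eigenvalue of Q."""
    return (p - 1 + math.sqrt(p * p + 6 * p + 1)) / 2

def left_eigen_residual(p):
    """Coordinates of pi Q - lambda pi for pi = (2, lambda) / (lambda + 2)."""
    lam = malthusian(p)
    pi_r, pi_b = 2 / (lam + 2), lam / (lam + 2)
    q = ((0.0, p), (2.0, p - 1.0))  # rows and columns ordered (R, B)
    first = pi_r * q[0][0] + pi_b * q[1][0] - lam * pi_r
    second = pi_r * q[0][1] + pi_b * q[1][1] - lam * pi_b
    return first, second
```

Both residuals should vanish up to floating point error for every p > 0, which confirms that the vector in question is a left eigenvector with eigenvalue $\lambda$.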


We denote the right (column) eigenvector of Q by u and normalize it so that $\pi u = 1$. For
later use, without computing, we denote by $v_2$ and $u_2$ the left (row) and right (column)
eigenvector of Q belonging to the eigenvalue $\lambda_2$. The most important theorem for our
purposes is that the CMBP grows exponentially with rate $\lambda$ (the so-called Malthusian
parameter), more precisely,


Theorem 2.6 ([5]). With the notation as above, almost surely,

\[
  \lim_{t \to \infty} \mathbf{A}(t) e^{-\lambda t} = W \pi,
\]

where W is a nonnegative random variable, the almost sure martingale limit of $W_t :=
\mathbf{A}(t) u e^{-\lambda t}$. Further, W > 0 almost surely on the event of non-extinction.

Theorem 2.7 ([5]). Define $T_m$, the $m$-th split time, as the time of the $m$-th death in the
branching process. (We assume $T_1 = 0$ for the death of the root.) On the event {W > 0},

(i) For each $q \in \{R, B\}$, $\lim_{m \to \infty} N_q(T_m)/N(T_m) = \lim_{m \to \infty} N_q(T_m)/m = \pi_q$ almost surely,

(ii) $\lim_{m \to \infty} m e^{-\lambda T_m} = \frac{1}{\lambda} W$ almost surely.

Corollary 2.8. For the vector of dead particles $\mathbf{N}(t) = (N_R(t), N_B(t))$,

\[
  \mathbf{N}(t) e^{-\lambda t} \xrightarrow{a.s.} \frac{1}{\lambda} W \pi.
\]

Proof of Theorems 2.5, 2.6, 2.7 and Corollary 2.8. The proofs can be found in [5, Chapter
V.7]. □


Throughout the next sections, we develop error bounds on the coupling between the 
branching process and the exploration process on the graph. For convenience, we introduce 


\[
  t_n := \frac{1}{2\lambda} \log n, \tag{2.3}
\]

the times we will observe the branching and exploration processes at, as well as

\[
  W^{(n)} := e^{-\lambda t_n} A(t_n), \quad \text{with } W^{(n)} \xrightarrow{a.s.} W, \tag{2.4}
\]

the approximations of the martingale limit W at the times $t_n$. Note that in our case,
extinction can never occur, hence almost surely W > 0.

2.4. Labeling, coupling, error terms. In this section we develop a coupling between the 
CMBP discussed in the previous section and SWT(t), the exploration process on $\mathrm{NW}_n(p)$.

Error bound on coupling the offspring. The CMBP is defined with Poi(p) blue offspring
distribution, while in the exploration process a vertex has Bin$(n - 3, p/n)$ or Bin$(n - 4, p/n)$
many blue children. Let $Y \sim \mathrm{Poi}(p)$ and $X \sim \mathrm{Bin}(n, p/n)$. By the usual coupling of binomial
and Poisson random variables, $\mathbb{P}(X \neq Y) \le p^2/n$. Let $Z \sim \mathrm{Bin}(n - 3, p/n)$ and $V \sim \mathrm{Bin}(3, p/n)$ be
independent. Then we can take $X = Z + V$, and under the usual coupling

\[
  \mathbb{P}(Z \neq Y) \le \mathbb{P}(X \neq Y) + \mathbb{P}(V \neq 0) \le \frac{p^2}{n} + \frac{3p}{n}.
\]

For the blue offspring of a blue vertex, $Z \sim \mathrm{Bin}(n - 4, p/n)$, by similar arguments
$\mathbb{P}(Z \neq Y) \le \frac{p^2}{n} + \frac{4p}{n}$ holds. Taking the maximum and using the union bound, the probability that
up to k steps, at least one particle has a different number of blue offspring in the exploration
process and the Poisson branching process, is at most $k(p^2 + 4p)/n$.
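
A numerical sanity check of these bounds is possible through the total variation distance, which is a lower bound for the probability that two coupled variables differ under any coupling, and is attained by a maximal coupling. The sketch below is an illustration, not part of the proof; the function name `tv_binomial_poisson` is hypothetical and the probability mass functions are evaluated on the log scale to avoid overflow.

```python
import math

def tv_binomial_poisson(trials, q, mean):
    """Total variation distance between Bin(trials, q) and Poi(mean), 0 < q < 1."""
    total, poisson_mass = 0.0, 0.0
    for k in range(trials + 1):
        log_binom = (math.lgamma(trials + 1) - math.lgamma(k + 1)
                     - math.lgamma(trials - k + 1)
                     + k * math.log(q) + (trials - k) * math.log1p(-q))
        log_poi = -mean + k * math.log(mean) - math.lgamma(k + 1)
        binom, poi = math.exp(log_binom), math.exp(log_poi)
        total += abs(binom - poi)
        poisson_mass += poi
    # Poisson mass above `trials` has no binomial counterpart.
    return 0.5 * (total + max(0.0, 1.0 - poisson_mass))
```

For instance, with n = 1000 and p = 2 the value `tv_binomial_poisson(n - 3, p / n, p)` can be compared with $(p^2 + 3p)/n$; the distance should not exceed the bound.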

2.4.1. Labeling and thinning. We relate the CMBP to the exploration process on $\mathrm{NW}_n(p)$
through the labeling of the former. Below, everything must be interpreted modulo n.

(i) The root is labeled u, the source of the exploration process; u can be U, a uniformly
chosen vertex in [n]. 

(ii) Every other particle gets a label when it is born. 

(iii) We distinguish “left type” and “right type” red children. Left type red particles have a
left type red child, right type red particles have a right type red child, blue particles 
have a red child of both types. 

(iv) A left type red child of v gets label v — 1, a right type red child of v is labeled v + 1. 

(v) The blue children of v get a set of labels uniformly chosen from [n]. 

Lemma 2.9. We say that the labeling fails if two explored vertices share the same label
(this still allows for several occurrences of the same label in the active set). The probability
that the labeling fails at the $i$-th split is at most $2i/n$.

Proof. The labeling fails at the $i$-th split if the splitting particle has a label that is already
taken by an explored vertex. We distinguish two cases.

When a blue particle splits. Since the label of a blue particle is chosen uniformly in [n], and
there are at most i - 1 dead labels already, the probability that we choose from this set is
at most $(i - 1)/n$.

When a red particle splits. Note that the labeling procedure ensures that whenever a blue
particle v is explored, it starts a growing (possibly asymmetric) interval of red vertices
around it. A red vertex, upon dying, extends this interval in one direction (if it is left type,
then towards the left). Note that the original vertex v in this interval had a uniformly chosen
label in [n]. Let us denote the position of the explored blue vertex by $c_k$, and write $l_k(T_i)$
and $r_k(T_i)$ for the number of explored red vertices to the left and to the right of $c_k$ after the
$i$-th split, $i > k$. Finally, we denote the whole interval of explored vertices around $c_k$ after the
$i$-th split by $\mathcal{I}_k(T_i)$. Recall that the process is by definition right-continuous.

In this setting, the label of a red vertex that is just being explored can coincide with the
label of an already explored red vertex if and only if two intervals ‘grow into each other’ at
the $i$-th split. Denote by $\mathcal{I}^*$ the interval that grows at the $i$-th split, and write $c^*, r^*(T_{i-1}), l^*(T_{i-1})$
for the location of its blue vertex, right and left length, respectively. Then, $\mathcal{I}^*$ grows into
another interval $\mathcal{I}_k$ if and only if $c_k$, the location of the blue vertex in $\mathcal{I}_k$, is at position
$c^* - l^*(T_{i-1}) - r_k(T_{i-1}) - 1$ or is at position $c^* + r^*(T_{i-1}) + l_k(T_{i-1}) + 1$. (The first case
means that the furthest explored red vertex on the right of $\mathcal{I}_k$ was a red active child of the
furthest explored left vertex in $\mathcal{I}^*$.) Since the location of $c_k$ is uniform in [n],

\[
  \mathbb{P}\big( \mathcal{I}^*(T_{i-1}) \cap \mathcal{I}_k(T_{i-1}) = \emptyset, \ \mathcal{I}^*(T_i) \cap \mathcal{I}_k(T_i) \neq \emptyset \big) = \frac{2}{n}.
\]

Note that there are exactly as many intervals as blue explored vertices (at either $T_{i-1}$ or $T_i$,
since the explored vertex $v_i$ must be red). Let the event $E_i$ = {$v_i$ is red and its label is
already used}. Hence,

\[
  \mathbb{P}(E_i) \le \sum_{k=1}^{N_B(T_{i-1}) - 1} \frac{2}{n} = \frac{2}{n} \big( N_B(T_{i-1}) - 1 \big) \le \frac{2i}{n},
\]

since there are at most i blue explored vertices. Note that the proof also applies when the
new red explored vertex coincides with a formerly explored blue one, in case $l_k(T_{i-1}) = 0$ or
$r_k(T_{i-1}) = 0$. Hence, the statement of the lemma follows. □


In $\mathrm{NW}_n(p)$, the shortest path from u to v through x necessarily uses the shortest path from
u to x. As a result, in the CMBP, we also do not need later occurrences of the label x. Hence,
we mark the second (or any later) occurrence of a label thinned, and all its descendants
ghosts. We move towards bounding the proportion of ghosts among active individuals to 
carry on with the CMBP approximation. To determine whether a vertex is a ghost, we need 
knowledge about its ancestors. 


2.4.2. Ancestral line. We approach the problem of ghost actives with the help of the ancestral
line. We define the ancestral line AL(w) of a vertex w as the chain of particles leading to
w from the root, including the root and w itself. Then an alive particle is a ghost if and
only if at least one of its ancestors is thinned. The ancestral line was introduced by Bühler
in [15, 16] with the following observation: for each time interval $[T_k, T_{k+1})$ we can allocate
a unique particle on the ancestral line that was active in the interval $[T_k, T_{k+1})$. For the
following observations, we condition on $\{D_i, i = 1, \dots, k\}$, where $D_i$ is the total number of
offspring of the $i$-th splitting particle. Denote by $G_k$ the generation of a uniformly chosen alive
(active) particle W after the $k$-th split. Then $G_k = L_1 + L_2 + \dots + L_k$, where the indicators
$L_i$ are conditionally independent and $L_i = 1$ if and only if the ancestor of W that was alive
in the time interval $[T_i, T_{i+1})$ was newborn (born at $T_i$). (A rewording of the indicators $L_i$
is as follows: $L_i = 1$ if and only if the $i$-th splitting particle is in AL(W).)

Since W is chosen uniformly, and at each split the individual to split is also chosen uniformly
among the currently active individuals, each one of these active individuals is equally likely
to be an ancestor of W. Further, in the interval $[T_i, T_{i+1})$, $D_i$ many particles are newborn,
and $S_i$ many are alive, which yields the probability $\mathbb{P}(L_i = 1 \mid D_i, i = 1, \dots, k) = D_i/S_i$, see
the discussion at the beginning of [16, Section 2.A]. We arrive at the following corollary:

Corollary 2.10. The probability of the dying particle $v_i$ being an ancestor of W, a uniformly
chosen active vertex after the $k$-th split, is

\[
  \mathbb{P}(v_i \in \mathrm{AL}(W) \mid D_i, i = 1, \dots, k) = \mathbb{P}(L_i = 1 \mid D_i, i = 1, \dots, k) = \frac{D_i}{S_i}.
\]
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Expected proportion of thinned actives. Let us combine Corollary 2.10 and Lemma 2.9. To 
be able to do so, we need the following lemma. We will provide its proof later on. 


Lemma 2.11. For every $\varepsilon > 0$, there exists a positive integer-valued random variable $K = K(\varepsilon)$ so that K is always finite and for every $i > K$, $S_i = i\lambda(1 + o(i^{-1/2+\varepsilon}))$ holds.

Recall that $t_n = (\log n)/(2\lambda)$, and it was chosen such that the number of active vertices is of order $\sqrt{n}$, and that $\mathcal{A}(t), A(t)$ and $\mathcal{N}(t), N(t)$ denote the set and number of active and dead individuals in the CMBP at time t, respectively.


Lemma 2.12. Let $\mathcal{A}_G(t) = \{w \in \mathcal{A}(t) : w \text{ is a ghost}\}$ be the set of ghost active vertices at time t and $A_G(t)$ its size. For every fixed $s \in \mathbb{R}$, the proportion $A_G(t_n + s)/A(t_n + s)$ tends to 0 in probability as n tends to infinity.


Proof of Lemma 2.12. The proportion $A_G(t)/A(t) = P(W \in \mathcal{A}_G(t))$, where W is uniform over $\mathcal{A}(t)$, i.e., a uniformly chosen active individual. Recall that $v_i$ is the particle that dies at $T_i$. For an event E, let us write $P_k(E) := P(E \mid D_i, i = 1, \dots, k)$. Using this notation and Corollary 2.10 for the representation of the ancestral line of $W \in \mathcal{A}(t)$, we can write

$$P_k(W \in \mathcal{A}_G(t)) \le \sum_{i=1}^{N(t)} P_k(v_i \in AL(W) \text{ and } v_i \text{ is thinned}).$$

Since the labeling is independent of the family tree,

$$P_k(W \in \mathcal{A}_G(t)) \le \sum_{i=1}^{N(t)} P_k(v_i \in AL(W)) \cdot P_k(v_i \text{ is thinned}) \le \sum_{i=1}^{N(t)} \frac{D_i}{S_i} \cdot \frac{2i}{n}. \quad (2.5)$$

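We apply Lemma 2.11 by splitting the sum into the parts up to K and above K, and use $D_i \le S_i$ for $i \le K$:

$$P(W \in \mathcal{A}_G(t)) \le \sum_{i=1}^{K} \frac{2i}{n} + \sum_{i=K+1}^{N(t)} \frac{D_i}{i\lambda(1 + o(i^{-1/2+\varepsilon}))} \cdot \frac{2i}{n} \le \frac{K(K+1)}{n} + \frac{4}{\lambda n} \sum_{i=K+1}^{N(t)} D_i \le \frac{K(K+1)}{n} + \frac{4(A(t) + N(t))}{\lambda n},$$

where we used that all particles are either active or dead in the process and, with a possible modification of K, we can have $(1 + o(i^{-1/2+\varepsilon})) > 1/2$ for all $i > K$. Next, we can use Corollary 2.8 and Theorem 2.6, which give that $N(t) + A(t) = e^{\lambda t}(1/\lambda + 1)W^{(n)}(1 + o(1))$. Hence

$$A_G(t_n + s)/A(t_n + s) = P(W \in \mathcal{A}_G(t_n + s)) \le \frac{K(K+1)}{n} + \frac{4(\lambda + 1)}{\lambda^2 \sqrt{n}} e^{\lambda s} W^{(n)}(1 + o(1)).$$

Here we used $t_n = \log n/(2\lambda)$, that is, $e^{\lambda t_n} = \sqrt{n}$. The right hand side tends to 0 as $n \to \infty$, since $W^{(n)} \to W$ and K is a tight random variable (does not depend on n). □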


Let us now return to the proof of Lemma 2.11. This lemma follows from [4, Theorem 
1, Theorem 2]. Here, we restate [4, Theorem 1] using our notations and for a special case, 
where each eigenvalue has multiplicity 1. This is sufficient for our purposes and easier than 
the general case. 

Theorem 2.13 (Asmussen, [4]). Let $Z_n$ be the number of individuals in the $n$-th generation of a (discrete time) supercritical multi-type Galton-Watson process, with dominant eigenvalue $\lambda$ and corresponding left and right eigenvectors v and u. For any other eigenvalue $\nu$, let $v_\nu$ and $u_\nu$ denote the left and right eigenvectors belonging to $\nu$.

For an arbitrary vector a with the property $v \cdot a = 0$ define

$$\mu := \sup\{\nu : v_\nu a \ne 0\}, \qquad \sigma^2 := \lim_{n \to \infty} \frac{v \cdot \mathrm{Var}(Z_n a)}{\lambda^n}. \quad (2.6)$$

If $\mu^2 < \lambda$, then with $C_n = (2\sigma^2 Z_n u \log n)^{1/2}$,

$$\liminf_{n \to \infty} \frac{Z_n a}{C_n} = -1 \quad \text{and} \quad \limsup_{n \to \infty} \frac{Z_n a}{C_n} = 1.$$

We also restate [4, Theorem 2] without change.

Theorem 2.14 (Asmussen 2., [4]). Replacing $Z_n$ with $A(t)$, $t \in [0, \infty)$, Theorem 2.13 remains valid for any supercritical irreducible multi-type Markov branching process.

Proof of Lemma 2.11. We use the previous two theorems for the 2-type branching process defined in Section 2.3. Since $\pi$ and $v_2$ are linearly independent, for any $a \ne (0, 0)$ with $\pi a = 0$, necessarily $v_2 a \ne 0$, which implies $\mu = \lambda_2$ in (2.6). The eigenvalues of the mean matrix M(t) are $e^{\lambda t}$ and $e^{\lambda_2 t}$. The condition $\mu^2 < \lambda$ in Theorem 2.13 is then equivalent to $2\lambda_2 < \lambda$, which follows from the nonnegativity of p through simple algebraic computations, see (2.1). The asymptotic variance and $C_t$ in this case become

$$\sigma^2 = \lim_{t \to \infty} \pi \mathrm{Var}(A(t)a) e^{-\lambda t}, \qquad C_t = (2\sigma^2 A(t)u \log t)^{1/2}.$$

This implies that the theorem rewrites to

$$\limsup_{t \to \infty} \frac{A(t)a}{C_t} = 1 \quad \text{and} \quad \liminf_{t \to \infty} \frac{A(t)a}{C_t} = -1.$$

Applying this for the split times $T_i$, we get that there is only a finite number of indices i such that $|A(T_i)a/C_{T_i}| > 2$. Let the maximum of these indices be K, a random variable. Since $T_i - \log i/\lambda$ has an almost sure limit by Theorem 2.7, $T_i$ is of order $\log i$. This implies that $C_{T_i}$ is of order $(i \log\log i)^{1/2}$ and, by the definition of almost sure convergence, $C_{T_i}$ exceeds $i^{1/2+\varepsilon}$ only finitely many times for every $\varepsilon > 0$.

Since $E[A(t)a] = 0$ if and only if $\pi a = 0$, we can apply the theorem for the centered version $S_i^c := S_i - E S_i$. Then for $i > K$, $|S_i^c| < C_{T_i}$. The fluctuation is of smaller order than $S_i$ itself, which means we can indeed write $S_i = i\lambda(1 + o(i^{-1/2+\varepsilon}))$. For more detail on this, see the proof of [25, Corollary 3.16]. □

2.4.3. The number of multiple active and active-explored labels. Recall that both in the exploration process as well as in the branching process there might be multiple occurrences of active vertices, see Remark 2.1, as thinning only prevents multiple explored labels. Later we want to use that the number of different active labels that are not ghosts at $T_i$ is approximately the same as $S_i$, i.e., there are not many multiple occurrences. In Lemma 2.12 we have seen that the proportion of ghosts is negligible on the time scale $t_n + s$, but we still have to deal with labels that are multiply active, or are explored and active at the same time. We will discuss these issues in the following five cases:

1. A blue active vertex has been already explored. 

2. A red active vertex has been already explored. 

3. A blue active vertex is also red active. 

4. A vertex is double red active. 

5. A vertex is double blue active.

Figure 2. We indicated the growing neighbourhood of a uniformly picked
vertex in the exploration process. Exclamation marks indicate ‘bad events’ 
for the coupling to a branching process: the vertices at the endpoint of edges 
(indicated along the edge) with two blue exclamation marks are vertices 
that are blue active and have been already explored as well. The vertex 
with two red and one blue exclamation mark is twice red active and once 
blue active at the same time. 


We will denote by $p_a(t)$ the probability that a uniformly chosen active vertex falls under case $a = 1, \dots, 5$ at time t, which is the same as the proportion of vertices falling under case a among all active vertices.

Case 1. Blue active being already explored. At time t, there are at most N(t) explored labels that are not thinned. Under the condition that the active vertex is blue, its label is chosen uniformly over [n], so the probability that this label has been already explored is at most N(t)/n. Substituting $N(t_n + s)$ from Corollary 2.8, for $t_n + s = \frac{1}{2\lambda}\log n + s$ we get

$$p_1(t_n + s) \le P(v \text{ is already explored} \mid v \text{ is blue}) \le N(t_n + s)/n. \quad (2.7)$$


Case 2. Red active being already explored. This case can be treated similarly to the thinning of red vertices, so we also use the notation introduced there. A label of a red active vertex is explored if and only if two intervals are about to grow into each other: the furthest explored red vertices in both intervals are neighbors. We call these intervals neighbors. Then, for two neighboring intervals, the active vertices at the end of each interval are explored in the other interval. Let $I_k$ and $I_j$, $1 \le k < j \le N_B(t)$, be intervals with blue particles with labels $c_k$ and $c_j$ respectively. Conditioned on $c_k$ and the lengths $l_k(t), r_k(t), l_j(t), r_j(t)$, there are two possibilities so that $I_k$ and $I_j$ are neighbors: $c_j = c_k + r_k + l_j + 1$ or $c_j = c_k - l_k - r_j - 1$. Thus for each pair of indices the probability of the intervals being neighbors is 2/n (these events are not independent, but expectation is linear). Summing up over all pairs of indices and dividing by the number of all red actives gives the proportion of case 2 red actives among all red actives:

$$p_2(t_n + s) \le \frac{1}{A_R(t_n + s)} \sum_{1 \le i < j \le N_B(t_n + s)} \frac{2}{n} = \frac{\binom{N_B(t_n + s)}{2} \cdot 2/n}{2N_B(t_n + s)} \le \frac{N_B(t_n + s)}{2n}. \quad (2.8)$$


Case 3. Blue active being red active. Using that the labels of blue vertices are chosen uniformly,

$$P(v \text{ is red and blue active}) = A_R(t_n + s)/n. \quad (2.9)$$

Case 4. Multiple red active vertices. This case is similar to Case 2. A vertex v can be red active twice if the two intervals that it belongs to are "almost neighbors", that is, both have v as an active vertex on one of their ends (v is the only vertex separating them). Conditioning on the location of one of the intervals, the blue vertex in the other interval can be at 2 different locations, hence

$$p_4(t_n + s) = p_2(t_n + s). \quad (2.10)$$

Case 5. Multiple blue active vertices. Again, the label of a blue vertex is chosen uniformly at random, hence the probability that the label of an active blue vertex v coincides with another active blue label is at most $A_B(t)/n$. Hence

$$p_5(t_n + s) \le A_B(t_n + s)/n. \quad (2.11)$$

Corollary 2.15. Define $A_e(t)$, the effective size of the active set, as follows: we subtract from A(t) the number of ghosts, already explored and multiple active labels, to get the number of different labels in $\mathcal{A}(t)$. Then

$$A_e(t_n + s)/A(t_n + s) \to 1.$$


Proof. By the previous arguments, a lower bound can be given if we subtract the individual probabilities for red and blue vertices to be deleted (note that this is a crude bound since we do not weight it with the proportion of red and blue active labels):

$$A_e(t_n + s)/A(t_n + s) \ge 1 - \sum_{i=1}^{5} p_i(t_n + s) \ge 1 - \frac{2N(t_n + s) + A(t_n + s)}{n},$$

where we summed up the rhs of (2.7), (2.8), (2.9), (2.10), (2.11) to obtain the rhs. Now we can use that $t_n + s = \log n/(2\lambda) + s$, use N(t) from Corollary 2.8, and Theorem 2.6 to get

$$A_e(t_n + s)/A(t_n + s) \ge 1 - e^{\lambda s}(2/\lambda + 1)W^{(n)}(1 + o(1))/\sqrt{n},$$

which tends to 1 since $W^{(n)} \to W$ a.s. by (2.4) and Theorem 2.6. □
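As a rough numerical illustration of the bound in this proof, the following Python sketch evaluates $1 - (2N + A)/n$ at $t = t_n + s$. It assumes leading-order asymptotics of the form $N(t) \approx e^{\lambda t} W/\lambda$ and $A(t) \approx e^{\lambda t} W$ and ignores all $o(1)$ corrections; the function name and the sample values of $\lambda$, $W$ and $s$ are illustrative choices, not quantities taken from the paper.

```python
import math

def effective_fraction_lower_bound(n, lam, w, s):
    """Leading-order value of 1 - (2 N(t) + A(t)) / n at t = t_n + s.

    Assumes N(t) ~ exp(lam * t) * w / lam and A(t) ~ exp(lam * t) * w,
    with t_n = log(n) / (2 * lam); o(1) corrections are ignored.
    """
    t = math.log(n) / (2.0 * lam) + s
    growth = math.exp(lam * t)   # equals sqrt(n) * exp(lam * s)
    dead = growth * w / lam      # stands in for N(t_n + s)
    active = growth * w          # stands in for A(t_n + s)
    return 1.0 - (2.0 * dead + active) / n

if __name__ == "__main__":
    # The bound approaches 1 at rate 1 / sqrt(n) for fixed s.
    for n in (10**4, 10**6, 10**8):
        print(n, effective_fraction_lower_bound(n, lam=1.5, w=1.0, s=0.0))
```

The correction term is of order $e^{\lambda s}/\sqrt{n}$, so the bound is only informative for $s$ of constant order, in line with the time scale $t_n + s$ used throughout.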


The conclusion of this section is summarized in the following corollary. 

Corollary 2.16. Fix n > 1 and p > 0. Consider the thinned CMBP with label u for the root. Then, there is a coupling of the shortest weight tree $SWT^u(t)$ in $NW_n(p)$ to the evolution of the thinned CMBP as long as $t \le t_n + M$ for some arbitrarily large $M \in \mathbb{R}$. Further, the set of active vertices in the thinned CMBP can be approximated by the set of labeled active vertices in the original CMBP, in the sense that the proportion of different labels among the actives over the total number of active vertices tends to one as $n \to \infty$, in the sense of Corollary 2.15.


3. Connection process 

Now that we have a good approximation of the shortest weight tree (SWT) started from a vertex, it provides us a method to observe the shortest weight path between two vertices. Let us give a rough sketch of this method before moving into the details. The previous section provides us with a coupling of a CMBP and the SWT as long as the total number of vertices is of order $\sqrt{n}$ in the SWT. To find the shortest weight path between vertices U and V, we grow the shortest weight tree from one of the vertices ($SWT^U$) until time $t_n$ (the size is then of order $\sqrt{n}$). Then, conditioned on the presence of $SWT^U(t_n)$, we grow $SWT^V$ and see when is the first time that these two trees intersect. The shortest weight path is determined by the first intersection of the explored set of vertices in the two processes. Note that a vertex w in the active set of vertices in $SWT^U(t_n)$ is at distance $t_n + \mathcal{R}_w(t_n)$ from U, where $\mathcal{R}_w(t_n)$ is its remaining lifetime. Since we have a good bound on the effective size of the set of active vertices, it turns out to be easier to look at the times when the first few active vertices in $SWT^U(t_n)$ become explored in $SWT^V$, and then minimize over the total distance of the path formed by connecting U and V via this vertex. This is what we shall carry out now rigorously.

Definition 3.1 (Collision and connection). We grow $SWT^U$, the shortest weight tree of U, until time $t_n = \frac{1}{2\lambda}\log n$, and then fix it. Then we grow $SWT^V$ until time $t_n + M$, for some large $M \in \mathbb{R}$, conditioned on the presence of $SWT^U(t_n)$. We say that a collision happens at time $t_n + s$ when an active vertex in $SWT^U(t_n)$ becomes explored in $SWT^V$ at time $t_n + s$. Denote the set of collision times by the point process $(t_n + P_i)_{i \in \mathbb{N}}$. If a collision happens at vertex $x_i$ at time $t_n + P_i$, this determines a path between U and V with length $2t_n + P_i + \mathcal{R}^U_{x_i}(t_n)$, where $\mathcal{R}^U_{x_i}(t_n)$ is the remaining lifetime of $x_i$ in $SWT^U(t_n)$. Then the length of the shortest weight path is given by

$$\min_{i \in \mathbb{N}} \left(2t_n + P_i + \mathcal{R}^U_{x_i}(t_n)\right)$$

among all collision events.^1



Figure 3. We indicated the growing neighbourhood of two uniformly picked vertices in the exploration process, with purple and yellow colors, respectively. The letters 'C' on the picture show that a collision event happens at the given vertex. Notice that all these collision events have a remaining edge-length yet to be covered in the exploration process of U, i.e., the vertex is active in $SWT^U$ and explored in $SWT^V$.

We can see that in the case of growing $SWT^V$ after $SWT^U$, the labels belonging to explored vertices in $SWT^U$ can not be used again, leading to some extra thinned vertices in $SWT^V$. We claim that the number of additional ghosts is not too big. (Since we would like to get a bound on the effective size of active vertices in $SWT^V(t_n + s)$, we must delete the descendants of vertices that formed earlier collision events.)

^1 We will see later that a.s. there is a first collision time. Hence, indexing by $i \in \mathbb{N}$ makes sense.

Claim 3.2. Consider the case of growing $SWT^V$ after $SWT^U(t_n)$ on the same graph $NW_n(p)$. Then the effective size of the active set in $SWT^V$ for times $t_n + s$ is asymptotically the same as the size of the active set, that is, the statement of Corollary 2.15 remains valid for $SWT^V$ as well.

Note that it suffices to bound the proportion of ghosts, as the error terms caused by multiple active, or active-explored vertices are not increased by the presence of $SWT^U$.


Proof. We consider the computations in the proof of Lemma 2.12, using (2.5). Recall that the proportion of ghosts depends simultaneously on the thinning probability of the $i$-th explored vertex as well as on it being an ancestor of a uniform active vertex.

The arguments with the ancestral line (see 2.4.2) remain valid without any modification, we only have to examine the change in the thinning probability.

In case the $i$-th explored vertex is blue, its label is chosen uniformly, thus the probability that this label coincides with a previously chosen label equals $(N^U(t_n) + i - 1)/n$. In case the $i$-th explored vertex in $SWT^V$ is red, we can use the same idea as before: it has the label of an already explored vertex if and only if two intervals grow into each other with the $i$-th step. We now consider the union of the intervals in $SWT^U$ and $SWT^V$. Conditioned on the interval that grows, for any interval the probability that these two grow into each other is 2/n. The number of intervals is at most the total number of blue explored vertices, $N_B^U(t_n) + N_B^V(T_i)$. Hence the probability that the labeling fails at the $i$-th step of $SWT^V$, if this is a red vertex, is at most

$$\sum_{k=1}^{N_B^U(t_n) + N_B^V(T_i)} \frac{2}{n} = \frac{2(N_B^U(t_n) + N_B^V(T_i))}{n} \le \frac{2(N^U(t_n) + i)}{n}.$$

Since the color of the $i$-th explored vertex is either blue or red, we get

$$P(\text{labeling fails in } SWT^V \text{ at step } i) \le 2(N^U(t_n) + i)/n.$$

For the probability of a uniformly chosen active vertex in $SWT^V$ being a ghost, similarly to (2.5), we have

$$\frac{A_G^V(t_n + s)}{A^V(t_n + s)} \le \sum_{i=1}^{N^V(t_n+s)} \frac{D_i}{S_i} \cdot \frac{2(N^U(t_n) + i)}{n} = \sum_{i=1}^{N^V(t_n+s)} \frac{D_i}{S_i} \cdot \frac{2i}{n} + \sum_{i=1}^{N^V(t_n+s)} \frac{D_i}{S_i} \cdot \frac{2}{n} N^U(t_n). \quad (3.1)$$

By Lemma 2.12, the first sum on the rhs tends to 0 as n tends to $\infty$. For the second sum, let us recall the a.s. finite K in Lemma 2.11, and split the sum again. We use Corollary 2.8, (2.4) and $t_n = \log n/(2\lambda)$ to get

$$\frac{2}{n} N^U(t_n) \sum_{i=1}^{K} \frac{D_i}{S_i} \le \frac{2}{n} N^U(t_n) \sum_{i=1}^{K} 1 \to 0.$$

For the second part of the sum, by Lemma 2.11 again,

$$\Sigma_2 := \frac{2}{n} N^U(t_n) \sum_{i=K+1}^{N^V(t_n+s)} \frac{D_i}{S_i} = \frac{2}{n} N^U(t_n) \sum_{i=K+1}^{N^V(t_n+s)} \frac{D_i}{i\lambda(1 + o(i^{-1/2+\varepsilon}))}.$$

Using $E[D_i] \le \lambda + 2$, we bound the expected value of the sum $\Sigma_3 := \sum_{i=K+1}^{N^V(t_n+s)} \frac{D_i}{i\lambda(1 + o(i^{-1/2+\varepsilon}))}$ with the tower rule:

$$E[\Sigma_3] \le E\Big[(1 + o(1)) \sum_{i=K+1}^{N^V(t_n+s)} \frac{\lambda + 2}{i\lambda}\Big] \le \frac{\lambda + 2}{\lambda} E\big[\log(N^V(t_n + s))\big](1 + o(1)).$$

Since the logarithm is concave, we use Jensen's inequality:

$$E[\Sigma_3] \le \frac{\lambda + 2}{\lambda} \log E\big[N^V(t_n + s)\big](1 + o(1)). \quad (3.2)$$

From Theorem 2.5 it follows that $E[N^V(t_n + s)] = (0, 1)\exp\{Q \cdot (t_n + s)\}\mathbf{1}$, where $\mathbf{1} = (1, 1)^T$ is a column vector. Using the Jordan decomposition of the matrix Q and exponentiating, elementary matrix analysis yields that the leading order is determined by the main eigenvalue $\lambda$ and hence $(0, 1)\exp\{Q(t_n + s)\}\mathbf{1} \le C_1 e^{\lambda(t_n + s)}$ for some constant $C_1 > 0$. Let us then use this bound with $t_n + s = \log n/(2\lambda) + s$ to give an upper bound on the rhs of (3.2), and set $C_2 := 2C_1(\lambda + 2)$. Then Markov's inequality yields

$$P(\Sigma_3 > C_2(\log n)^2) \le 1/\log n \to 0.$$

Then on the w.h.p. event $\{\Sigma_3 \le C_2(\log n)^2\}$, for $\Sigma_2 = \frac{2}{n}N^U(t_n)\Sigma_3$, by Corollary 2.8 and (2.4) again,

$$\Sigma_2 \le \frac{2}{n} N^U(t_n) C_2(\log n)^2 \le C_2(\log n)^2 \frac{2e^{\lambda t_n} W_U^{(n)}}{\lambda n}(1 + o(1)) \to 0.$$

□
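The logarithmic bound on $E[\Sigma_3]$ used in this proof rests on comparing a harmonic-type sum with a logarithm. The Python sketch below checks that comparison numerically, replacing $D_i$ by the bound $\lambda + 2$ on its mean and $S_i$ by its leading order $i\lambda$; the function names and parameter values are illustrative assumptions only.

```python
import math

def sigma3_mean_bound(n_explored, k, lam):
    """Sum over i = k+1, ..., n_explored of (lam + 2) / (i * lam).

    This is the deterministic sum obtained from Sigma_3 after replacing
    D_i by the bound lam + 2 on its mean and S_i by i * lam.
    """
    return sum((lam + 2.0) / (i * lam) for i in range(k + 1, n_explored + 1))

def log_bound(n_explored, lam):
    """The comparison value (lam + 2) / lam * log(n_explored)."""
    return (lam + 2.0) / lam * math.log(n_explored)

if __name__ == "__main__":
    for n_explored in (10**2, 10**4, 10**5):
        print(n_explored,
              sigma3_mean_bound(n_explored, k=1, lam=1.5),
              log_bound(n_explored, lam=1.5))
```

For any $k \ge 1$ the sum stays below the logarithmic bound, which is why only a $(\log n)^2$ threshold is needed before applying Markov's inequality.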


3.1. The Poisson point process of collisions. Recall that we say that a collision event happens at time $t_n + s$ when an active vertex in $SWT^U(t_n)$ becomes explored in $SWT^V$ at time $t_n + s$. First we show that for each pair of colours, with respect to the parameter s in $t_n + s$, the set of points $(P_i)_{i \in \mathbb{N}}$ form a non-homogeneous Poisson point process (PPP) on $\mathbb{R}$, and that these PPPs are asymptotically independent. We consider the intensity measure $\mu(dt)$, $t \in \mathbb{R}$, as the derivative of the mean function M(t) (the expected number of points up to time t). To determine the intensity measure of the collision process, we will consider the four collision point processes for each possible pair of colours. None of the PPPs is empty: since the labels of blue vertices are chosen uniformly, they can meet any color, and considering the growing set of intervals, we see that red can meet red as well.

Let us introduce the notation for $q, r \in \{R, B\}$, $s \in \mathbb{R}$:

$$\mathcal{C}_{q,r}(s) := \mathcal{N}_q^V(t_n + s) \cap \mathcal{A}_r^U(t_n), \qquad C_{q,r}(s) := |\mathcal{C}_{q,r}(s)|,$$

$$\mathcal{C}(s) := \bigcup_{q,r \in \{R,B\}} \mathcal{C}_{q,r}(s), \qquad C(s) := |\mathcal{C}(s)|. \quad (3.3)$$

(Note that e.g. $\mathcal{C}_{R,B}(s)$ denotes the set of red explored labels in $SWT^V$ that are blue active in $SWT^U$.) The corresponding intensity measures are denoted by $\mu_{q,r}(s)$ and the total intensity measure by $\mu(s)$.

In Corollary 2.15 we showed that the effective size of the active set at times $t_n + s$ for $s \in \mathbb{R}$ is asymptotically the same as the number of active individuals in the CMBP, so we can use the asymptotics of $(A_R(t), A_B(t))$ from Theorem 2.6. We shall handle the cases when blue vertices are involved in a similar fashion. For the coming sections, for $s \in \mathbb{R}$ and an event E we define the notation $P_s(E) := P(E \mid A_R^U(t_n), A_B^U(t_n), N_R^V(t_n + s), N_B^V(t_n + s))$.

Blue-blue collision. By the definition of the set $\mathcal{C}_{B,B}(s)$, we can use the following indicator representation:

$$C_{B,B}(s) = \sum_{x \in \mathcal{N}_B^V(t_n + s)} 1\{x \in \mathcal{A}_B^U(t_n)\}. \quad (3.4)$$

Recall that the labels in $\mathcal{A}_B^U(t_n)$ are chosen uniformly in [n], hence for any label $x \in [n]$,

$$P_s(x \in \mathcal{A}_B^U(t_n)) = 1 - (1 - 1/n)^{A_B^U(t_n)} = A_B^U(t_n)/n \, (1 + o(1)). \quad (3.5)$$

Further, since $A_B^U(t_n) = O(\sqrt{n})$, an elementary calculation using the inclusion-exclusion formula shows that the indicators in (3.4) are weakly dependent in the sense that $P_s(y \in \mathcal{A}_B^U(t_n) \mid x \in \mathcal{A}_B^U(t_n)) = A_B^U(t_n)/n \, (1 + o(1))$. Further, the events $\{x \in \mathcal{N}_B^V(t_n + s)\}$ and $\{x \in \mathcal{A}_B^U(t_n)\}$ are independent, so the usual Poisson approximation for weakly dependent random variables (e.g. via factorial moments, as in [22, Theorems 2.4, 2.5]) yields that $C_{B,B}(s)$ converges to a Poisson variable for each $s \in \mathbb{R}$, and by Wald's equation,

$$E_s[C_{B,B}(s)] = N_B^V(t_n + s) \cdot A_B^U(t_n)/n \, (1 + o(1)).$$

By noting that $1\{x \in \mathcal{N}_B^V(t_n + s_1)\} \le 1\{x \in \mathcal{N}_B^V(t_n + s_2)\}$ when $s_1 \le s_2$, we also get that $C_{B,B}(s_1) \le C_{B,B}(s_2)$, and hence by standard methods one can show that the process $(C_{B,B}(s))_{s \in \mathbb{R}}$ converges to a Poisson point process on $\mathbb{R}$.
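To see how accurate the approximation in (3.5) is when the active set has size of order $\sqrt{n}$, the following Python sketch compares the exact probability $1 - (1 - 1/n)^{a}$ with $a/n$. The function names are illustrative, and $a$ labels are modelled as independent uniform draws from $[n]$, which is the setting (3.5) describes.

```python
import math

def hit_probability(n, a):
    """Exact P(a fixed label x is among a independent uniform labels from [n])."""
    return 1.0 - (1.0 - 1.0 / n) ** a

def relative_error(n, a):
    """Relative gap between the exact probability and the approximation a / n."""
    approx = a / n
    return abs(hit_probability(n, a) - approx) / approx

if __name__ == "__main__":
    for n in (10**4, 10**6, 10**8):
        a = math.isqrt(n)  # active set of order sqrt(n)
        print(n, a, relative_error(n, a))
```

The relative error behaves like $a/(2n)$, i.e. like $1/(2\sqrt{n})$ when $a \approx \sqrt{n}$, which is the $o(1)$ term in (3.5).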

Red-blue collision. Using the same argument as for the previous case,

$$C_{R,B}(s) = \sum_{x \in \mathcal{N}_R^V(t_n + s)} 1\{x \in \mathcal{A}_B^U(t_n)\}, \qquad E_s[C_{R,B}(s)] = N_R^V(t_n + s) \cdot A_B^U(t_n)/n \, (1 + o(1)).$$

Further, for $y \in \mathcal{N}_B^V(t_n + s)$ and $x \in \mathcal{N}_R^V(t_n + s)$, the indicators $1\{z \in \mathcal{A}_B^U(t_n)\}$ for $z = x, y$ are only weakly dependent, so we can also conclude that the process $(C_{R,B}(s))_{s \in \mathbb{R}}$ is asymptotically independent of $(C_{B,B}(s))_{s \in \mathbb{R}}$.

Blue-red collision. This time we approach the number of collision events through indicators of labels in the explored set, i.e.,

$$\mathcal{C}_{B,R}(s) = \mathcal{N}_B^V(t_n + s) \cap \mathcal{A}_R^U(t_n), \qquad C_{B,R}(s) = \sum_{x \in \mathcal{A}_R^U(t_n)} 1\{x \in \mathcal{N}_B^V(t_n + s)\}. \quad (3.6)$$

The blue labels in $\mathcal{N}_B^V(t_n + s)$ are chosen uniformly in [n], hence a similar calculation as in (3.5), and the independence of the labeling in $\mathcal{N}_B^V(t_n + s)$ and in $\mathcal{A}_R^U(t_n)$, yields again

$$E_s[C_{B,R}(s)] = A_R^U(t_n) \cdot N_B^V(t_n + s)/n \, (1 + o(1)).$$

The independence of the three processes $(C_{B,B}(s), C_{R,B}(s), C_{B,R}(s))_{s \in \mathbb{R}}$ can be seen by 'reversing' the indicators in (3.4) to get a similar form as the one in (3.6), and noting that the variables included are again only weakly dependent.

Using the asymptotic results for A(t), N(t) (from Theorem 2.6 and Corollary 2.8) and the definition of $W^{(n)}$ in (2.4), then differentiating with respect to s yields

$$\mu_{B,B}(ds) = \pi_B^2 W_U^{(n)} W_V^{(n)} e^{\lambda s} ds \, (1 + o(1)),$$

$$\mu_{R,B}(ds) = \pi_R \pi_B W_U^{(n)} W_V^{(n)} e^{\lambda s} ds \, (1 + o(1)), \quad (3.7)$$

$$\mu_{B,R}(ds) = \pi_B \pi_R W_U^{(n)} W_V^{(n)} e^{\lambda s} ds \, (1 + o(1)),$$

where the $1 + o(1)$ factor only depends on n and comes from the error of the approximation $N_q(t) = e^{\lambda t}\lambda^{-1}W^{(n)}\pi_q(1 + o(1))$. It is not hard to show (e.g. using the Borel-Cantelli lemma) that these PPPs have only finitely many points on $(-\infty, 0)$, hence indexing the points by $i \in \mathbb{N}$ is possible.




Red-red collision. The red-red collision events have to be treated slightly differently, since here the geometry of the cycle plays a more important role. Recall that we stopped the evolution of $SWT^U$ at time $t_n$. Hence, we consider $SWT^U(t_n)$ as a fixed set of intervals $\{I_k, k = 1, \dots, N_B^U(t_n)\}$ (some of them might have already merged by time $t_n$), while the set of intervals $\{J_i(s), i = 1, \dots, N_B^V(t_n + s)\}$ in $SWT^V(t_n + s)$ is growing (and possibly merging) with s. A collision happens when one of the intervals $J_i(s)$ grows into one of the intervals $I_k$.

Note that here we face a new technical issue. When two intervals $J_i(s)$ and $I_k$ collide at time s, in principle we should stop the evolution of $J_i(s)$, that is, for all $s' > s$ we should have $J_i(s') = J_i(s)$. This, however, would cause computational difficulties later, since then we would have to condition on all the earlier collisions to be able to calculate the intensity of the next one. Hence, it is easier to do the following approximation on the number of red-red collisions: we let $J_i(s)$ grow further and it might collide with more vertices inside $I_k$. The error terms caused by such events are negligible, since such events have been already treated when we investigated the 'extra' thinning of $SWT^V$ imposed by $SWT^U(t_n)$, which had a negligible contribution in the sense of Claim 3.2.

We decompose the number of collisions as a sum of indicators. Recall the notation from paragraph 2.4.1: $c_k, l_k, r_k$ stand for the location of the blue vertex and the number of red explored vertices to the left and to the right of $c_k$ in the interval $I_k$. We adopt the same notation for the intervals $J_i(s)$ by adding a superscript V to the above quantities. We claim that for any pair $I_k$ and $J_i(s)$, the event that they have already made a red-red collision by time $t_n + s$ is

$$\{J_i(s) \cap I_k \ne \emptyset \text{ via red-red collision}\} = \bigcup_{m=1}^{r_i^V(s)} \{c_i^V = c_k - l_k - 1 - m\} \cup \bigcup_{m=1}^{l_i^V(s)} \{c_i^V = c_k + r_k + 1 + m\}. \quad (3.8)$$

To see this, we condition on $c_k, r_k, l_k$. Note that there are two active red vertices at the boundary of $I_k$: one on the left, at location $c_k - l_k - 1$, and one on the right, at location $c_k + r_k + 1$. When the one at location $c_k - l_k - 1$ coincides with any of the $r_i^V(s)$ explored right-type red vertices of $J_i(s)$, a (potentially past) collision has happened. Each of these events uniquely determines the position of $c_i^V$, the location of the blue vertex of $J_i(s)$, forming an interval of length $r_i^V(s)$ for potential locations. We remark that this left active vertex at location $c_k - l_k - 1$ can not coincide with a left-type red vertex in $J_i(s)$, since that would mean that either $c_i^V$ also lies in $I_k$, causing an event that we already counted in Claim 3.2, or $c_i^V$ is to the right of the whole interval $I_k$ and $l_i^V(s)$ is already so large that it swallowed the whole interval $I_k$, and this case has also been already treated in Claim 3.2. (In this case, the part of $J_i(s)$ that overlaps with $I_k$ has been already thinned.) Similarly, there are $l_i^V(s)$ many positions for $c_i^V$ on the right of $c_k$ that cause a valid collision event. Using that $c_i^V$ is chosen uniformly over [n], since it is a blue vertex, yields the right hand side of (3.8). Note that these events are mutually exclusive, and hence

$$P_s(J_i(s) \cap I_k \ne \emptyset \text{ via red-red collision}) = (l_i^V(s) + r_i^V(s))/n. \quad (3.9)$$

The number of red-red collisions can be obtained by summing up the indicators of the events on the right hand side of (3.8) for all intervals $I_k$ and $J_i(s)$. This yields a collection of indicators that have a weak dependence structure: each indicator only depends on at most $\max_i |J_i(s)|$ many other indicators. Since $t = O(\log n)$, and the length of each interval is determined by consecutive exponential variables (i.e. the time to reach the $k$-th red vertex on the right of a blue vertex has a Gamma(k, 1) distribution), it is elementary to show that $\max_i |J_i(s)| = o((\log n)^2)$ asymptotically almost surely. As a result, a Poisson approximation can be carried out by using e.g. the factorial moments of these indicators, in a similar fashion to the one in [22, Proposition 7.12], using [22, Theorems 2.4, 2.5]. We leave the details to the reader. We get that $C_{R,R}(s)$ converges to a PPP with intensity measure given by

$$E_s[C_{R,R}(s)] = \sum_{k=1}^{N_B^U(t_n)} \sum_{i=1}^{N_B^V(t_n+s)} (l_i^V(s) + r_i^V(s))/n.$$

Note that the inner sum is simply $N_R^V(t_n + s)/n$, while $N_B^U(t_n) = A_R^U(t_n)/2$, since there are two red actives in each interval $I_k$. Then,

$$E_s[C_{R,R}(s)] = A_R^U(t_n) \cdot N_R^V(t_n + s)/(2n).$$

Further, similar arguments as before show that this PPP is asymptotically independent of $C_{B,B}(s)$, $C_{R,B}(s)$ and $C_{B,R}(s)$, and we also obtain

$$\mu_{R,R}(ds) = \frac{\pi_R^2}{2} W_U^{(n)} W_V^{(n)} e^{\lambda s} ds \, (1 + o(1)). \quad (3.10)$$

We emphasize that to get (3.7) and (3.10) we assumed that the number of actual intervals in $SWT^U(t_n)$ and $SWT^V(t_n + s)$ is $N_B^U(t_n)$ and $N_B^V(t_n + s)$, respectively. This is not entirely true due to the fact that intervals within $SWT^U$ or within $SWT^V$ might have merged already earlier. However, in this case some of the included vertices are ghosts: Corollary 2.15 shows that the effective size of the active set at times $t_n + s$ for $s \in \mathbb{R}$ is asymptotically the same as the number of active individuals in the CMBP, which, by the fact that every interval has precisely two active red vertices, implies that also the number of disjoint active intervals is asymptotically the same as $N_B^U(t_n)$ and $N_B^V(t_n + s)$, respectively, in the two processes. We have arrived at the following theorem:

Theorem 3.3 (Total intensity measure of the collision PPP). The total intensity measure of the collision Poisson point process is

$$\mu(ds) = W_U^{(n)} W_V^{(n)} e^{\lambda s}\left(\pi_B^2 + 2\pi_B\pi_R + \pi_R^2/2\right) ds \, (1 + o(1)) = W_U^{(n)} W_V^{(n)} e^{\lambda s}\left(1 - \pi_R^2/2\right) ds \, (1 + o(1)), \quad (3.11)$$

where the factor $(1 + o(1))$ only depends on n.

Proof. We have calculated the individual intensity measures of the four Poisson point processes in (3.7) and (3.10). These processes are asymptotically independent, since the total collection of all the indicator variables included in their construction are only weakly dependent. One way then is to show that the joint factorial moments converge, in a similar manner to the one in [22, Proposition 7.12]. Or, one might also consider the point process of the total intensities immediately together to get (3.11). It is important to observe that the depletion of the number of active individuals due to earlier collisions does not influence the asymptotic intensities, since the first is of constant order on the time scale $t_n + s$, $s \in \mathbb{R}$, while the latter is of order $\sqrt{n}$, the size of the active/explored labels in each of the colors.

□
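The way the four colour-pair intensities add up to the total in (3.11) can be checked with a few lines of Python. This is a minimal sketch: it drops the $(1 + o(1))$ factors, assumes $\pi_B + \pi_R = 1$ (which is what the second equality in (3.11) relies on), and uses made-up function names and parameter values.

```python
import math

def collision_intensities(pi_r, w_u, w_v, lam, s):
    """Leading-order intensity densities of the four collision processes at s.

    Keys are (explored colour in SWT^V, active colour in SWT^U).
    Assumes pi_b = 1 - pi_r and drops the (1 + o(1)) factors.
    """
    pi_b = 1.0 - pi_r
    base = w_u * w_v * math.exp(lam * s)
    return {
        ("B", "B"): pi_b * pi_b * base,
        ("R", "B"): pi_r * pi_b * base,
        ("B", "R"): pi_b * pi_r * base,
        ("R", "R"): 0.5 * pi_r * pi_r * base,
    }

def total_intensity(pi_r, w_u, w_v, lam, s):
    """Closed form w_u * w_v * exp(lam * s) * (1 - pi_r**2 / 2)."""
    return w_u * w_v * math.exp(lam * s) * (1.0 - 0.5 * pi_r ** 2)

if __name__ == "__main__":
    parts = collision_intensities(pi_r=0.4, w_u=1.3, w_v=0.8, lam=1.2, s=0.5)
    print(sum(parts.values()), total_intensity(0.4, 1.3, 0.8, 1.2, 0.5))
```

The red-red term carries the extra factor 1/2 because each interval contributes two red active vertices, which is exactly what turns $(\pi_B + \pi_R)^2 = 1$ into $1 - \pi_R^2/2$.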

3.2. Proof of Theorem 1.1. It is well known [32] that if $(E_i)_{i \in \mathbb{N}}$ is a collection of i.i.d. random variables with distribution function $F_E(y)$, and the points $(P_i)_{i \in \mathbb{N}}$ form a one-dimensional Poisson point process with intensity measure $\mu(ds)$ on $\mathbb{R}$, then the points $(P_i, E_i)_{i \in \mathbb{N}}$ form a two-dimensional non-homogeneous Poisson point process on $\mathbb{R} \times \mathbb{R}$, with intensity measure $\mu(ds) \times F_E(dy)$.

In our case, to get the shortest path between U and V, recall from Definition 3.1 that we have to minimize the sum of the collision time and the remaining lifetime over the collision events. Mathematically, we want to minimize the quantity $P_i + E_i$ over all points $(P_i, E_i)$ of the two-dimensional PPP with intensity measure $\nu(ds \times dy) := \mu(ds) \times e^{-y}dy$, since the remaining lifetimes are i.i.d. exponential random variables.

Figure 4. An illustration of the two-dimensional PPP with points $(P_i, E_i)_{i \in \mathbb{N}}$, with the point minimizing the sum $P_i + E_i$ indicated.

Note that the event $\{\min_i (P_i + E_i) > z\}$ is equivalent to the event that there is no point in the infinite triangle $\Delta(z) = \{(x, y) : y > 0, x + y < z\}$ in this two-dimensional PPP. We calculate

$$\nu(\Delta(z)) = \int_{-\infty}^{z} \int_{0}^{z-s} \mu(ds) \cdot e^{-y}dy = W_U^{(n)} W_V^{(n)} \left(1 - \frac{\pi_R^2}{2}\right) \frac{e^{\lambda z}}{\lambda(\lambda + 1)}(1 + o(1)).$$

For short, we introduce

$$\mathcal{W}^{(n)} := \left(1 - \frac{\pi_R^2}{2}\right) \frac{W_U^{(n)} W_V^{(n)}}{\lambda(\lambda + 1)}. \quad (3.12)$$

Then we can reformulate

$$\nu(\Delta(z)) = \mathcal{W}^{(n)} e^{\lambda z}(1 + o(1)). \quad (3.13)$$
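The factor $1/(\lambda(\lambda+1))$ in $\nu(\Delta(z))$ comes from integrating an intensity of the form $c\,e^{\lambda s}\,ds$ against the exponential density over the triangle $\Delta(z)$. The Python sketch below checks this by a plain midpoint rule; here `c` stands for the constant $W_U W_V (1 - \pi_R^2/2)$, and the function names, cut-off and step count are illustrative assumptions.

```python
import math

def nu_triangle_closed_form(z, c, lam):
    """Closed form c * exp(lam * z) / (lam * (lam + 1))."""
    return c * math.exp(lam * z) / (lam * (lam + 1.0))

def nu_triangle_numeric(z, c, lam, lower_cut=40.0, steps=200_000):
    """Midpoint rule for the integral over s < z of c e^{lam s} (1 - e^{-(z - s)}).

    The inner integral of e^{-y} over 0 < y < z - s is done analytically,
    and the negligible part of the range below z - lower_cut is dropped.
    """
    h = lower_cut / steps
    total = 0.0
    for k in range(steps):
        s = z - lower_cut + (k + 0.5) * h
        total += c * math.exp(lam * s) * (1.0 - math.exp(-(z - s)))
    return total * h

if __name__ == "__main__":
    print(nu_triangle_numeric(0.3, 2.0, 1.2), nu_triangle_closed_form(0.3, 2.0, 1.2))
```

Analytically, the integral splits as $c\,e^{\lambda z}\left(1/\lambda - 1/(\lambda+1)\right)$, which is the closed form the numerical value should reproduce.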

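Let us turn our attention back to $\mathcal{P}_n(U, V)$, the shortest weight path between U and V. By the previous argument, we conclude

$$P(\mathcal{P}_n(U, V) > z + 2t_n \mid \mathcal{W}^{(n)}) = P(\mathrm{Poi}(\nu(\Delta(z))) = 0 \mid \mathcal{W}^{(n)}) = \exp\{-\nu(\Delta(z))\}. \quad (3.14)$$

Rearranging the left hand side and substituting the computed value of $\nu(\Delta(z))$, we get

$$P(2t_n - \mathcal{P}_n(U, V) < -z \mid \mathcal{W}^{(n)}) = \exp\{-\mathcal{W}^{(n)} e^{\lambda z}\}.$$

We substitute $t_n = \log n/(2\lambda)$, and set $z := -x/\lambda$ to get

$$P(\log n - \lambda\mathcal{P}_n(U, V) < x \mid \mathcal{W}^{(n)}) = \exp\{-\exp\{-(x - \log \mathcal{W}^{(n)})\}\}.$$

We recognize on the right hand side the cumulative distribution function of a shifted Gumbel random variable, which implies

$$(\log n - \lambda\mathcal{P}_n(U, V)) \mid \mathcal{W}^{(n)} \overset{d}{=} \Lambda + \log \mathcal{W}^{(n)},$$

where $\Lambda$ denotes a standard Gumbel random variable. Rearranging and substituting from (3.12), and using that the martingales $(W_U^{(n)}, W_V^{(n)}) \to (W_U, W_V)$,

$$\mathcal{P}_n(U, V) - \frac{1}{\lambda}\log n \xrightarrow{d} -\frac{1}{\lambda}\Lambda - \frac{1}{\lambda}\log(W_U W_V) - \frac{1}{\lambda}\log\left(1 - \frac{\pi_R^2}{2}\right) + \frac{1}{\lambda}\log(\lambda(\lambda + 1)).$$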
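The change of variables $z = -x/\lambda$ that turns the void probability of the triangle into a shifted Gumbel distribution function can be verified numerically. The Python sketch below does only this algebraic check; `w` plays the role of $\mathcal{W}^{(n)}$, and the function names and test values are illustrative assumptions.

```python
import math

def void_probability(z, w, lam):
    """exp(-w * e^{lam z}): probability of no collision point in Delta(z)."""
    return math.exp(-w * math.exp(lam * z))

def shifted_gumbel_cdf(x, w):
    """CDF at x of Lambda + log(w), where Lambda is standard Gumbel."""
    return math.exp(-math.exp(-(x - math.log(w))))

if __name__ == "__main__":
    lam, w = 1.3, 2.5
    for x in (-1.0, 0.0, 0.7, 2.0):
        # The two columns should agree up to floating-point error.
        print(x, void_probability(-x / lam, w, lam), shifted_gumbel_cdf(x, w))
```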

This finishes the proof of Theorem 1.1.

4. Epidemic curve


Recall the definition of the epidemic curve function from Section 1.2. The discussion of the epidemic curve consists of three parts: first, we find the correct function f by computing the first moment of $I_n(t, U)$. Then we prove the convergence in probability by bounding the second moment. Finally, we give a characterization of $M_{W_V}$, the moment generating function of the random variable $W_V$, that determines the epidemic curve function f.


4.1. First moment. First, we condition on the value ^ from the martingale approxi¬ 
mation of the branching process of the uniformly chosen vertex U. Then we can express 
the fraction of infected individuals as a sum of indicators, and calculate its conditional 
expectation: 

E [l„(t, U) IWjj'’ ] = — ^ P (in is infected by time t ) ■ 

[n] 


Note that the rhs equals the probability of a uniformly chosen vertex, which we shall denote 
by V, being infected. Also note that a vertex is infected if and only if its distance from U is 
shorter than the time passed, hence 


E [I„(t, U) \Wlj^ ] = P {Vn{U, F) <t\Wl^^). 


(4.1) 


Now we can further condition on Wy^ and use the distribution of Vn{U, V) conditioned on 
Wlj\Wy'’. Recall from Subsection 3.2, that 

p (VniU, V)>z + 2t„\= exp (l - ^4) . (4.2) 

Let us set here z = t— ^ log and rearrange, yielding 

p {Vn{U, V)<t-{log i log n I IF';’ , 1F<;’ ) 

= 1 - exp {-IF.^"’ (1 - 14) 

Then, from (4.1) and (4.3) we get 


E 


ut - { loglF<;’ + i logn, U)\wp] = 1 - E[exp {-IF^ (l - ^4) 


(4-3) 

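We recognize that the second term on the right hand side is the moment generating function
$M_{W_V^{(n)}}(x)$ of $W_V^{(n)}$, evaluated at $x = x(t)$, where $x(t)$ is the (negative) deterministic
coefficient multiplying $W_V^{(n)}$ in the exponent. Changing variables yields

$$\mathbb{E}\big[I_n(t + \tfrac{1}{\lambda}\log n, U) \,\big|\, W_U^{(n)}\big] = 1 - M_{W_V^{(n)}}\big(x(t + \tfrac{1}{\lambda}\log W_U^{(n)})\big).$$

Note that $W_V^{(n)}$ converges to $W_V$ almost surely, which implies that their moment generating
functions converge in probability. The limiting function is exactly the one given in Theorem 1.3.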
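As an informal illustration of the quantity $I_n(t,U)$ (not part of the proof), the following Python sketch samples a Newman-Watts graph with i.i.d. Exp(1) edge weights, computes infection times from one source by Dijkstra's algorithm, and evaluates the infected fraction. All function names are hypothetical. The closed form used for $\lambda$ (the positive root of $\lambda^2 + (1-\rho)\lambda - 2\rho = 0$) is an assumption derived from the offspring means of the red/blue branching process, and should be checked against the definition of $\lambda$ in Section 2.

```python
import heapq
import math
import random


def newman_watts(n, rho, rng):
    """Cycle on n vertices plus shortcuts; every edge gets an Exp(1) weight."""
    adj = [[] for _ in range(n)]

    def add_edge(u, v):
        w = rng.expovariate(1.0)
        adj[u].append((v, w))
        adj[v].append((u, w))

    for u in range(n):
        add_edge(u, (u + 1) % n)
    # Every pair that is not a cycle edge becomes a shortcut w.p. rho/n.
    for u in range(n):
        for v in range(u + 2, n):
            if u == 0 and v == n - 1:
                continue  # (0, n-1) is already a cycle edge
            if rng.random() < rho / n:
                add_edge(u, v)
    return adj


def infection_times(adj, source):
    """Dijkstra: the least-weight distance from the source is the infection time."""
    dist = [math.inf] * len(adj)
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist


def infected_fraction(dist, t):
    """Empirical I_n(t, U): fraction of vertices within distance t of the source."""
    return sum(1 for d in dist if d <= t) / len(dist)


def malthusian_parameter(rho):
    """Assumed formula: positive root of lam^2 + (1 - rho) * lam - 2 * rho = 0."""
    return ((rho - 1.0) + math.sqrt((rho - 1.0) ** 2 + 8.0 * rho)) / 2.0


if __name__ == "__main__":
    n, rho = 1000, 1.0
    rng = random.Random(1)
    dist = infection_times(newman_watts(n, rho, rng), source=0)
    shift = math.log(n) / malthusian_parameter(rho)
    for t in range(-4, 5):
        print(t, round(infected_fraction(dist, t + shift), 3))
```

Plotting the printed values against $t$ for several independent samples should show curves of a common shape that differ by a horizontal shift, which is the behaviour described by the random shift $\frac{1}{\lambda}\log W_U$.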


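4.2. Second moment. The first moment above showed that the conditional expectation of $I_n(t, U)$
indeed converges in probability to the value of the function $f$ at the given point. We prove
Theorem 1.3 by showing that the variance of $I_n(t, U)$ converges to 0; then Chebyshev's
inequality yields that $I_n(t, U)$ converges to its expectation in probability.

Denote by $\mathbb{1}_i = \mathbb{1}\{i \text{ is infected by time } t\}$. Let us calculate

$$\mathrm{Var}\Big(\frac{1}{n}\sum_{i\in[n]} \mathbb{1}_i \,\Big|\, W_U^{(n)}\Big) = \frac{1}{n^2}\sum_{i\in[n]} \mathrm{Var}\big(\mathbb{1}_i \,\big|\, W_U^{(n)}\big) + \frac{2}{n^2}\sum_{i<j\in[n]} \mathrm{Cov}\big[\mathbb{1}_i, \mathbb{1}_j \,\big|\, W_U^{(n)}\big].$$

Since $\mathbb{1}_i$ is an indicator, $\mathrm{Var}(\mathbb{1}_i \,|\, W_U^{(n)}) \le 1$, hence the first term on the rhs is at most $1/n$. As
for the second term,

$$\mathrm{Cov}\big[\mathbb{1}_i, \mathbb{1}_j \,\big|\, W_U^{(n)}\big] = \mathbb{E}\big[\mathbb{1}_i \mathbb{1}_j \,\big|\, W_U^{(n)}\big] - \mathbb{E}\big[\mathbb{1}_i \,\big|\, W_U^{(n)}\big]\,\mathbb{E}\big[\mathbb{1}_j \,\big|\, W_U^{(n)}\big]$$
$$= \mathbb{P}\big(i \text{ and } j \text{ are both infected} \,\big|\, W_U^{(n)}\big) - \mathbb{P}\big(i \text{ is infected} \,\big|\, W_U^{(n)}\big)\,\mathbb{P}\big(j \text{ is infected} \,\big|\, W_U^{(n)}\big).$$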

Imagine now three exploration processes on $\mathrm{NW}_n$, one from $U$, one from $i$ and one
from $j$. The three exploration processes from these three
vertices can be approximated by three independent branching processes. This implies
that the covariance can be bounded by the error of the coupling between the graph and
the branching processes, together with the thinning inside one tree and between the trees:
these all have error terms of order at most $1/\log n$. The
coupling can be extended to three SWT's (instead of two, as before), and the error
terms increase only by constant multiples. The connection processes between $\mathrm{SWT}^U$ and
the other two are related only through the intersection of $\mathrm{SWT}^i$ and $\mathrm{SWT}^j$, which is
again at most of order $1/\log n$. As a result,
$\mathbb{P}(i \text{ and } j \text{ are both infected} \,|\, W_U^{(n)}) - \mathbb{P}(i \text{ is infected} \,|\, W_U^{(n)})\,\mathbb{P}(j \text{ is infected} \,|\, W_U^{(n)}) = O(1/\log n)$.

This coupling works if $i$ and $j$ are sufficiently far apart, say $(i - j) \bmod n > (\log n)^{1+\varepsilon}$ for any
$\varepsilon > 0$ (this is w.h.p. longer than the length of the longest red interval). The number of "bad
pairs", which are closer than this, is $n(\log n)^{1+\varepsilon}/2$; compared to the number of all pairs $\binom{n}{2}$,
this fraction goes to 0. Even for these pairs, the covariance is bounded by 1. Hence the sum divided
by $n^2$ goes to 0.

With that, we have bounded the variance by a term that goes to 0, which finishes the 
proof. 
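For the reader's convenience, the two estimates can be collected into a single bound. This is only a summary of the argument above, reading the threshold for bad pairs as $(\log n)^{1+\varepsilon}$ and using $\frac{2}{n^2}\binom{n}{2} \le 1$:

```latex
\begin{align*}
\mathrm{Var}\big(I_n(t,U)\,\big|\,W_U^{(n)}\big)
 &\le \frac{1}{n}
   + \frac{2}{n^2}\Big[
       \underbrace{\tfrac{1}{2}\,n(\log n)^{1+\varepsilon}\cdot 1}_{\text{bad pairs}}
     + \underbrace{\binom{n}{2}\cdot O\big(\tfrac{1}{\log n}\big)}_{\text{remaining pairs}}
     \Big] \\
 &\le \frac{1}{n} + \frac{(\log n)^{1+\varepsilon}}{n} + O\big(\tfrac{1}{\log n}\big)
 \;\longrightarrow\; 0 .
\end{align*}
```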


4.3. Characterization of the epidemic curve function. In this section, we prove
Proposition 1.5. Recall that adding a superscript $(B)$ or $(R)$ indicates a branching process described
in Section 2.3 that is started from a blue or red type vertex, respectively. We start with the
recursive formula for the martingale limit random variables from [5]:

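$$W^{(B)} \overset{d}{=} \sum_{i=1}^{2} e^{-\lambda X_i}\, W_i^{(R)} + \sum_{j=1}^{D_B^{(B)}} e^{-\lambda X_j}\, W_j^{(B)},$$

where $W_i^{(R)}$ are independent copies of $W^{(R)}$, the martingale limit of the branching process
started from a red individual, $W_j^{(B)}$ are independent copies of $W^{(B)} = W_V$, and $X_i, X_j$ are
i.i.d. $\mathrm{Exp}(1)$. Denote the moment generating functions of $W^{(R)}$ and $W^{(B)}$ by $M_{W^{(R)}}(\vartheta)$ and
$M_{W^{(B)}}(\vartheta)$, respectively. Recall that a blue individual has two red and $\mathrm{Poi}(\rho)$ many blue
children. Hence

$$M_{W^{(B)}}(\vartheta) = \mathbb{E}\big[\exp\{\vartheta W^{(B)}\}\big] = \Big(\mathbb{E}\big[\exp\{\vartheta e^{-\lambda X_i} W^{(R)}\}\big]\Big)^{2} \cdot \mathbb{E}\Big[\exp\Big\{\vartheta \sum_{j=1}^{D_B^{(B)}} e^{-\lambda X_j} W_j^{(B)}\Big\}\Big]. \tag{4.4}$$

We use the law of total expectation with respect to $X_i$ to compute

$$J^{(R)}(\vartheta) := \mathbb{E}\big[\exp\{\vartheta e^{-\lambda X_i} W^{(R)}\}\big] = \int_0^\infty \mathbb{E}\big[\exp\{\vartheta e^{-\lambda x} W^{(R)}\}\big]\, e^{-x}\, dx = \int_0^\infty M_{W^{(R)}}(\vartheta e^{-\lambda x})\, e^{-x}\, dx.$$

Let $J^{(B)}(\vartheta)$ be defined similarly, with $W^{(R)}$ replaced by $W^{(B)}$. Then, the second factor in (4.4)
can be treated by conditioning on $D_B^{(B)}$ and using independence:

$$\mathbb{E}\Big[\prod_{j=1}^{D_B^{(B)}} \exp\{\vartheta e^{-\lambda X_j} W_j^{(B)}\} \,\Big|\, D_B^{(B)}\Big] = \big(J^{(B)}(\vartheta)\big)^{D_B^{(B)}}.$$

Taking expectation w.r.t. $D_B^{(B)} \overset{d}{=} \mathrm{Poi}(\rho)$ yields that

$$\mathbb{E}\Big[\exp\Big\{\vartheta \sum_{j=1}^{D_B^{(B)}} e^{-\lambda X_j} W_j^{(B)}\Big\}\Big] = \exp\big\{\rho\,\big(J^{(B)}(\vartheta) - 1\big)\big\}.$$

We can rewrite the factor in the exponent as

$$J^{(B)}(\vartheta) - 1 = \int_0^\infty M_{W^{(B)}}(\vartheta e^{-\lambda x})\, e^{-x}\, dx - 1 = \int_0^\infty \big(M_{W^{(B)}}(\vartheta e^{-\lambda x}) - 1\big)\, e^{-x}\, dx,$$

then the moment generating function in (4.4) becomes

$$M_{W^{(B)}}(\vartheta) = \Big(\int_0^\infty M_{W^{(R)}}(\vartheta e^{-\lambda x})\, e^{-x}\, dx\Big)^{2} \cdot \exp\Big\{\rho \int_0^\infty \big(M_{W^{(B)}}(\vartheta e^{-\lambda x}) - 1\big)\, e^{-x}\, dx\Big\}.$$

Similarly for $M_{W^{(R)}}$, using that $D_R^{(R)} = 1$ and $D_B^{(R)} \overset{d}{=} \mathrm{Poi}(\rho)$,

$$M_{W^{(R)}}(\vartheta) = \int_0^\infty M_{W^{(R)}}(\vartheta e^{-\lambda x})\, e^{-x}\, dx \cdot \exp\Big\{\rho \int_0^\infty \big(M_{W^{(B)}}(\vartheta e^{-\lambda x}) - 1\big)\, e^{-x}\, dx\Big\}.$$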

We have just shown that the moment generating functions satisfy the system of equations
given in Proposition 1.5, and by [5], there exist proper moment generating functions satisfying
these functional equations.
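A quick consistency check of the two functional equations, offered as a sketch rather than as part of the proof: differentiating both equations at $\vartheta = 0$ and using $\int_0^\infty e^{-\lambda x} e^{-x}\,dx = 1/(\lambda+1)$ gives a linear system for the means $m_R = \mathbb{E}[W^{(R)}]$ and $m_B = \mathbb{E}[W^{(B)}]$, namely $(\lambda+1) m_R = m_R + \rho m_B$ and $(\lambda+1) m_B = 2 m_R + \rho m_B$. A non-trivial solution exists only if $\lambda^2 + (1-\rho)\lambda - 2\rho = 0$. The function names below are hypothetical, and the identification of this root with the paper's $\lambda$ is an assumption to be checked against Section 2.

```python
import math


def malthusian_parameter(rho):
    """Positive root of lam^2 + (1 - rho) * lam - 2 * rho = 0 (assumed form)."""
    return ((rho - 1.0) + math.sqrt((rho - 1.0) ** 2 + 8.0 * rho)) / 2.0


def mean_map(m_red, m_blue, rho, lam):
    """Derivative at 0 of the right-hand sides of the two functional equations.

    A red individual has 1 red and Poi(rho) blue children, a blue individual
    has 2 red and Poi(rho) blue children; each child contributes a factor
    E[exp(-lam * X)] = 1 / (lam + 1) with X ~ Exp(1).
    """
    scale = 1.0 / (lam + 1.0)
    new_red = scale * (1.0 * m_red + rho * m_blue)
    new_blue = scale * (2.0 * m_red + rho * m_blue)
    return new_red, new_blue


if __name__ == "__main__":
    for rho in (0.5, 1.0, 3.0):
        lam = malthusian_parameter(rho)
        # Fixed point direction: m_blue = lam * m_red / rho.
        print(rho, lam, mean_map(1.0, lam / rho, rho, lam))
```

With this choice of $\lambda$ the vector $(1, \lambda/\rho)$ is mapped to itself, so the first-moment equations are consistent with a non-degenerate pair $(W^{(R)}, W^{(B)})$.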


5. Central limit theorem for the hopcount 

In this section we prove Theorem 1.2, which states that the hopcount $H_n(U, V)$, the number
of edges along the shortest weight path between two vertices $U$ and $V$ chosen uniformly at
random, satisfies a central limit theorem with mean and variance both $\frac{\lambda+1}{\lambda}\log n$.

For this, we consider the shortest weight path between $U$ and $V$ in two parts: the path
from $U$ within $\mathrm{SWT}^U(t_n)$ and the path from $V$ within $\mathrm{SWT}^V$, to the vertex where the connection
happens. We denote the vertex where the connection happens by $Y$. These paths are disjoint
with the exception of $Y$, hence it suffices to determine their lengths, i.e. the graph distance
of $Y$ from $U$ and of $Y$ from $V$. Denote by $G^{(U)}(Y)$ the generation of $Y$ in $\mathrm{SWT}^U$, and similarly for
$V$. Then the number of steps required from the root $U$ to $Y$ is exactly $G^{(U)}(Y)$.

Claim 5.1. The choice of $Y$ is asymptotically independent in the two SWT's.

Proof. Conditioned on $Y$ being the connecting vertex, it is uniformly chosen over the active
set of $\mathrm{SWT}^V$. That determines its label, and the label in turn determines which particle is chosen
in $\mathrm{SWT}^U$. Since the labeling is independent of the structure of the family tree, aside
from the thinning, the choice of $Y$ in $\mathrm{SWT}^U$ is independent of its choice in $\mathrm{SWT}^V$. We
have already bounded the fraction of ghost particles (those that have one of their ancestors
thinned) by a term that goes to 0 in Lemma 2.12, hence asymptotic independence holds. □

With these notations, $H_n(U, V) = G^{(U)}(Y) + G^{(V)}(Y)$, and the two terms are asymptotically
independent by Claim 5.1. We reformulate the theorem using these terms:

$$\frac{H_n(U,V) - \frac{\lambda+1}{\lambda}\log n}{\sqrt{\frac{\lambda+1}{\lambda}\log n}} = \frac{G^{(U)}(Y) - \frac{\lambda+1}{2\lambda}\log n}{\sqrt{\frac{\lambda+1}{\lambda}\log n}} + \frac{G^{(V)}(Y) - \frac{\lambda+1}{2\lambda}\log n}{\sqrt{\frac{\lambda+1}{\lambda}\log n}}. \tag{5.1}$$

Since the terms are asymptotically independent, it suffices to show that both terms on the right
hand side are asymptotically normal with mean 0 and variance $1/2$. We show that both terms
on the rhs of (5.1), multiplied by $\sqrt{2}$, are asymptotically standard normal. Because of the way
we established the connection between $\mathrm{SWT}^U$ and $\mathrm{SWT}^V$, the two terms need to be treated
somewhat differently.

5.1. Generation of the connecting vertex in $\mathrm{SWT}^V$. Recall that we established the
connection between $\mathrm{SWT}^U$ and $\mathrm{SWT}^V$ in the following way: we grew $\mathrm{SWT}^U$ until time $t_n$,
then froze its evolution. Then we grew $\mathrm{SWT}^V$, and every time a label was assigned to a
splitting particle, we checked whether this label belongs to the active set of $\mathrm{SWT}^U$. As a result, the
connecting vertex $Y$ is a particle at some splitting time $T_k$, and hence is chosen uniformly over
the active vertices. This implies that we can use for its generation the indicator decomposition
of the ancestral line described in Section 2.4.2: $Y$'s generation is $G_k = \sum_{i=1}^{k} \mathbb{1}_i$, where
conditioned on the offspring variables $D_i$, the indicators are independent and have success
probability $\mathbb{P}(\mathbb{1}_i = 1) = D_i/S_i$.
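The indicator decomposition can be explored numerically. The sketch below is a simplified stand-in for the process in the paper, under stated assumptions: it grows a two-type tree in which a red individual has 1 red and $\mathrm{Poi}(\rho)$ blue children and a blue individual has 2 red and $\mathrm{Poi}(\rho)$ blue children, it picks the next splitting individual uniformly among the active ones (justified by memoryless Exp(1) lifetimes), and it ignores labels, thinning and ghosts. Here $D_i$ is the number of children born at the $i$th split and $S_i$ the number of active individuals right after it. All names are hypothetical.

```python
import math
import random


def poisson(rho, rng):
    """Knuth's multiplication method; adequate for small rho."""
    threshold = math.exp(-rho)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def grow_tree(splits, rho, rng):
    """Return (generations of the active individuals, sum over i of D_i / S_i)."""
    # Each active individual is stored as (is_blue, generation); the root is blue.
    active = [(True, 0)]
    compensator = 0.0
    for _ in range(splits):
        # Uniform choice of the next individual to split, removed by swap-and-pop.
        idx = rng.randrange(len(active))
        active[idx], active[-1] = active[-1], active[idx]
        is_blue, gen = active.pop()
        reds = 2 if is_blue else 1
        blues = poisson(rho, rng)
        active.extend([(False, gen + 1)] * reds)
        active.extend([(True, gen + 1)] * blues)
        compensator += (reds + blues) / len(active)  # D_i / S_i
    return [g for _, g in active], compensator


if __name__ == "__main__":
    rho, splits = 1.0, 20000
    gens, comp = grow_tree(splits, rho, random.Random(7))
    print("mean generation of active individuals:", sum(gens) / len(gens))
    print("sum of D_i / S_i:", comp)
```

In such runs the average generation of the active individuals and the sum $\sum_i D_i/S_i$ should be of comparable size, both growing logarithmically in the number of splits.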

In our case the number of splits is a random variable. Recall from Section 3.2 that the
connection time minus $t_n$ forms a tight random variable (see e.g. (3.14)), hence until the
connection there are $N(t_n + Z)$ many explored vertices for some random $Z \in \mathbb{R}$. By Corollary
2.8, $N(t_n + Z) = C\sqrt{n}$ for some bounded random variable $C$ (that might depend on $n$, but is
tight). Denote by



Then 

y^iogn v^ioei 

Our aim is to show that Lindeberg's CLT is applicable for $B_1$, that $B_2$ converges to 1, and that $B_3$
converges to 0.


5.1.1. Term $B_1$. For this sum of (conditionally independent) indicators, Lindeberg's condition
is trivially satisfied if the total variance tends to infinity. To give a lower bound, we write

$$\sum_{i=1}^{C\sqrt{n}} \frac{D_i}{S_i}\Big(1 - \frac{D_i}{S_i}\Big) = \sum_{i=1}^{C\sqrt{n}} \frac{D_i}{S_i} - \sum_{i=1}^{C\sqrt{n}} \frac{D_i^2}{S_i^2}. \tag{5.4}$$

Recall Lemma 2.11, and split the sum according to the random variable $K$. Each vertex has
at least one red child, hence $D_i \ge 1$. Then

$$\sum_{i=K+1}^{C\sqrt{n}} \frac{D_i}{S_i} \ge \sum_{i=K+1}^{C\sqrt{n}} \frac{1}{i\lambda\,(1 + o(i^{-1/2+\varepsilon}))},$$

where $K$ is a.s. finite. Each term on the rhs is at least $1/(2i\lambda)$, and thus the rhs tends to
infinity, and is of order $\log n/(2\lambda)$. For the second term in (5.4), we can use that the second
moment of $D_i \overset{d}{=} \mathrm{Poi}(\rho) + 1 + \mathbb{1}\{i\text{th explored is blue}\}$ can be bounded by some constant $M_2$
independent of $i$. Hence, again cutting the sum at $K$, the sum of the first $K$ terms is a.s.
finite. For the rest, we can use Lemma 2.11 again, and then Markov's inequality yields:



Df 


Cy/n 


i=K+l 


(i2A2(l + 0(z-l/2+e)2) - 


^E 


M 2 


log log n < 


1 

log log n' 


(5.5)

Since the bound in (5.5) is of smaller order than $\log n$, combining the two estimates for the
two terms in (5.4), we see that the variance tends to infinity w.h.p. As a result, the term $B_1$
in (5.2) satisfies a CLT.


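5.1.2. Term $B_2$. Similarly as for the term $B_1$, we cut the sum at $K$ given by Lemma 2.11
and write

$$\frac{\sum_{i=1}^{C\sqrt{n}} \frac{D_i}{S_i}\big(1 - \frac{D_i}{S_i}\big)}{\frac{\lambda+1}{2\lambda}\log n} = \frac{\sum_{i=1}^{K} \frac{D_i}{S_i}\big(1 - \frac{D_i}{S_i}\big)}{\frac{\lambda+1}{2\lambda}\log n} + \frac{\sum_{i=K+1}^{C\sqrt{n}} \frac{D_i}{S_i}}{\frac{\lambda+1}{2\lambda}\log n} - \frac{\sum_{i=K+1}^{C\sqrt{n}} \frac{D_i^2}{S_i^2}}{\frac{\lambda+1}{2\lambda}\log n}.$$

The first fraction tends to 0, as the numerator is a.s. finite. For the numerator in the third
term, we can use (5.5) again, which shows that the third term tends to zero w.h.p. We
have yet to show that the second term tends to 1. Let $\mathcal{F}_n = \sigma(D_1, \dots, D_n)$ be the filtration
generated by the random variables $D_i$. Then

$$\frac{\sum_{i=K+1}^{C\sqrt{n}} \frac{D_i}{S_i}}{\frac{\lambda+1}{2\lambda}\log n} = \frac{\sum_{i=K+1}^{C\sqrt{n}} \frac{D_i - \mathbb{E}[D_i \mid \mathcal{F}_{i-1}]}{i\lambda(1 + o(i^{-1/2+\varepsilon}))}}{\frac{\lambda+1}{2\lambda}\log n} + \frac{\sum_{i=K+1}^{C\sqrt{n}} \frac{\mathbb{E}[D_i \mid \mathcal{F}_{i-1}]}{i\lambda(1 + o(i^{-1/2+\varepsilon}))}}{\frac{\lambda+1}{2\lambda}\log n}. \tag{5.6}$$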


For the first term of the rhs of (5.6), we will use Chebyshev’s inequality. For this, elementary 
calculation using tower rule yields that 


Var 


^ A-E[A|A-i] 

^+1 A(1 + o(i-i/2+e)) 


- A2i2(l + o(i-l/2+e)2 


Since ¥.[Dl\Fi-i\ < M 2 as in Section 5.1.1, we get that the rhs is at most M 2 Tt^/{SX^). Then 
Chebyshev’s inequality yields 


Cy/n 

E 


A - E[A|-^i-i] 

iA(l + o(f“^/ 2 +e)) 


> log log n ■ 


TT^ M 2 

y A^ 


< 


1 

(log log n) 2 ’ 


(5.7) 


This implies that the first term in (5.6) tends to 0 w.h.p. 

Now to show that the second term in (5.6) tends to 1, we use a corollary of Theorem 2.6
(see [5]), stating that the suitably normalized vector $(S_i^R, S_i^B)$ converges a.s. Further analysis
(in particular, the central limit theorem about $(S_i^R, S_i^B)$ in [25]) yields that the error term is at
most of order $i^{-1/2+\varepsilon}$. Hence, using that $D_i \overset{d}{=} \mathrm{Poi}(\rho) + 1 + \mathbb{1}\{i\text{th explored is blue}\}$ and the
definition of $\lambda$, it is elementary to show that

$$\mathbb{E}[D_i \mid \mathcal{F}_{i-1}] = (\lambda + 1)\,(1 + o(i^{-1/2+\varepsilon})).$$


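Substituting this into the sum, we have

$$\frac{\sum_{i=K+1}^{C\sqrt{n}} \frac{\mathbb{E}[D_i \mid \mathcal{F}_{i-1}]}{i\lambda(1 + o(i^{-1/2+\varepsilon}))}}{\frac{\lambda+1}{2\lambda}\log n} = \frac{\sum_{i=K+1}^{C\sqrt{n}} \frac{1}{i}}{\log n/2} + \frac{\sum_{i=K+1}^{C\sqrt{n}} o(i^{-3/2+\varepsilon})}{\log n/2}. \tag{5.8}$$

The first term on the rhs, introducing a constant error term $\delta$ from the integral approximation,
equals

$$\frac{\sum_{i=K+1}^{C\sqrt{n}} \frac{1}{i}}{\log n/2} = \frac{\log(C\sqrt{n}) - \log(K+1) + \delta}{\log n/2} \xrightarrow{\text{a.s.}} 1, \tag{5.9}$$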


since $C$ is a tight random variable. The second term in (5.8) is at most $\sum_{i=1}^{\infty} o(i^{-3/2+\varepsilon})$, which is
summable and finite, hence divided by $\log n$ it tends to 0. Combining everything, we get
that $B_2$ in (5.2) tends to 1 w.h.p.
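The convergence of the main term is slow, since the correction is of order $1/\log n$. A minimal numerical sketch, with hypothetical fixed values $C = 3$ and $K = 5$ standing in for the tight random variable $C$ and the a.s. finite $K$, evaluates $\sum_{i=K+1}^{C\sqrt{n}} \frac{1}{i}$ divided by $\log n / 2$:

```python
import math


def b2_main_term(n, c=3.0, k=5):
    """Ratio of the harmonic-type sum over K < i <= C*sqrt(n) to log(n)/2."""
    upper = int(c * math.sqrt(n))
    partial = math.fsum(1.0 / i for i in range(k + 1, upper + 1))
    return partial / (0.5 * math.log(n))


if __name__ == "__main__":
    for exponent in (4, 6, 8, 10):
        n = 10 ** exponent
        print(f"n = 10^{exponent}: ratio = {b2_main_term(n):.4f}")
```

The ratio approaches 1 as $n$ grows, with the gap shrinking like a constant times $1/\log n$, in line with the term $(\log C - \log(K+1) + \delta)/(\log n/2)$.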
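5.1.3. Term $B_3$. As before, we cut this sum at $K$ given by Lemma 2.11, and the sum of
the first $K$ terms after normalization tends to 0 since $D_i/S_i \le 1$. When we consider the rest
of the sum, we use the approximation of $S_i$ (given by Lemma 2.11) and add and subtract
$\mathbb{E}[D_i \mid \mathcal{F}_{i-1}]$ again:

$$\frac{\sum_{i=K+1}^{C\sqrt{n}} \frac{D_i}{S_i} - \frac{\lambda+1}{2\lambda}\log n}{\sqrt{\frac{\lambda+1}{2\lambda}\log n}} = \frac{\sum_{i=K+1}^{C\sqrt{n}} \frac{D_i - \mathbb{E}[D_i \mid \mathcal{F}_{i-1}]}{i\lambda(1 + o(i^{-1/2+\varepsilon}))}}{\sqrt{\frac{\lambda+1}{2\lambda}\log n}} + \frac{\sum_{i=K+1}^{C\sqrt{n}} \frac{\mathbb{E}[D_i \mid \mathcal{F}_{i-1}]}{i\lambda(1 + o(i^{-1/2+\varepsilon}))} - \frac{\lambda+1}{2\lambda}\log n}{\sqrt{\frac{\lambda+1}{2\lambda}\log n}}. \tag{5.10}$$

The numerator of the first term on the rhs has been treated in (5.7) and is w.h.p. of order at
most $\log\log n$, hence the first term on the rhs tends to 0 w.h.p. For the second term on the
rhs of (5.10) we can use (5.8) and (5.9), and then it is at most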


A + 1 ^ log C - \og{K + 1) + (5 + A ^ Q 

^ \/^logn 

almost surely, since $C$ is a tight random variable. This shows that the term $B_3$ in (5.2) tends
to 0 w.h.p., and finishes the proof of the CLT for the generation of the connecting vertex in
$\mathrm{SWT}^V$, see (5.3).


5.2. Generation of the connecting vertex in $\mathrm{SWT}^U$. For the generation of $Y$ in $\mathrm{SWT}^U$,
we have to use a different approach. This is because the label of the connecting vertex is
chosen uniformly among the active vertices of $\mathrm{SWT}^V$ but is not necessarily uniform over the
active vertices in $\mathrm{SWT}^U$. Indeed, it is a longish but elementary calculation to show that
conditioned on the event that a connection happens, any active red label in $\mathrm{SWT}^U$ is chosen
with asymptotic probability (A(^)(t„))“^(l — ^)/(l — ^), while any active blue label is 
chosen with asymptotical probability (A(^)(t„))“^l/(1 — where A(^)(<„) is the total 
number of active vertices in $\mathrm{SWT}^U$. However, the following claim is still valid and will be
enough to show the needed CLT:

Claim 5.2. Conditioned on the connecting vertex having a label of a certain color in $\mathrm{SWT}^U$,
with high probability, it is chosen uniformly at random among the active labels of that color
in $\mathrm{SWT}^U$.


Proof. We show the statement first for color blue. Recall that a blue label was chosen
uniformly in $[n]$. Since the restriction of a uniform distribution to any set is again uniform,
the probability of connection is the same for each of the different active blue labels
in $\mathrm{SWT}^U$. Recall that the number of different labels
(that are neither thinned nor ghost) is called the effective size and it is treated in Corollary
2.15.

The problem here, though, is that some labels in the branching process approximation
are multiply active, and these are neither thinned nor called ghosts, and if chosen, they
modify the uniform probability for the connection.^ However, Corollary 2.15 implies that
the fraction of multiple actives tends to 0 at time $t_n$. Hence if we pick an active label in
$A^{(U)}(t_n)$, it has multiplicity 1 w.h.p., including red and blue instances. (This also implies


^Consider a label $v_i$ in the BP of $\mathrm{SWT}^U$ that is active $m_i$ times, i.e., there are $m_i$ individuals in the BP
having label $v_i$. The label $v_i$ in $\mathrm{SWT}^V$ is still chosen only with the same probability (that is, $1/n$). Since
which one of the $m_i$ individuals has the minimal remaining lifetime is uniformly distributed, every individual
with label Vi has probability to be the connecting vertex, conditioned on connection at a 

blue label (with being the effective size of blue labels). 
that asymptotically, the label has a well defined color.) Hence with high probability, at the
connecting vertex, we have a uniform distribution over all possible blue active labels.

An analogous argument can be carried through for red active labels as well, using the fact
that the centre of the interval to which they belong is chosen uniformly, and the fact that
multiple red labels have proportion tending to 0 at time $t_n$. □

To finish the central limit theorem of $G^{(U)}(Y)$, we use a general result of Kharlamov
[26] about the generation of a uniformly chosen active individual in a given type-set in a
multi-type branching process. For this, consider a type set $S$ of a multi-type branching
process, and let $A_S = \bigcup_{q \in S} A_q$ be the set of active individuals with any type from the type-set
$S$. Then, [26, Theorem 2] states that the generation of a uniformly chosen individual in $A_S$
satisfies a central limit theorem with asymptotic mean and variance that is independent of
the choice of $S$.

To apply this result, first pick $S := \{R, B\}$ in our case. Then, the statement simply turns
into a CLT of the generation of a uniformly picked active individual. We have seen when
treating $\mathrm{SWT}^V$ that the asymptotic mean and variance are both $\frac{\lambda+1}{2\lambda}\log n$ in this case.

Now apply the result again for $S := \{R\}$ and $S := \{B\}$, separately. Combined with
the previous observation, we get that an individual chosen uniformly at random with color
blue/red, respectively, also satisfies a CLT with the same asymptotic mean and variance.
This, combined with Claim 5.2, implies that whether $Y$ is red or blue in $\mathrm{SWT}^U$, its generation
$G^{(U)}(Y)$ admits a central limit theorem with mean and variance $\frac{\lambda+1}{2\lambda}\log n$. This completes
the proof of Theorem 1.2.